Signal processing is a ubiquitous part of modern technology. Its mathematical 
basis and many areas of application are the subject of this book, based on a 
series of graduate-level lectures held at the Mathematical Sciences Research 
Institute. Emphasis is on current challenges, new techniques adapted to new 
technologies, and certain recent advances in algorithms and theory. The book 
covers two main areas: computational harmonic analysis, envisioned as a tech- 
nology for efficiently analyzing real data using inherent symmetries; and the 
challenges inherent in the acquisition, processing and analysis of images and 
sensing data in general — ranging from sonar on a submarine to a neurosci- 
entist’s fMRI study. 
Hyperbolic Geometry, Nehari’s Theorem, 
Electric Circuits, and Analog Signal Processing 

JEFFERY C. ALLEN AND DENNIS M. HEALY, JR. 



Abstract. Underlying many of the current mathematical opportunities in 
digital signal processing are unsolved analog signal processing problems. 
For instance, digital signals for communication or sensing must map into 
an analog format for transmission through a physical layer. In this layer 
we meet a canonical example of analog signal processing: the electrical 
engineer’s impedance matching problem. Impedance matching is the de- 
sign of analog signal processing circuits to minimize loss and distortion as 
the signal moves from its source into the propagation medium. This pa- 
per works the matching problem from theory to sampled data, exploiting 
links between $H^\infty$ theory, hyperbolic geometry, and matching circuits. We 
apply J. W. Helton’s significant extensions of operator theory, convex anal- 
ysis, and optimization theory to demonstrate new approaches and research 
opportunities in this fundamental problem. 
1. The Impedance Matching Problem 

Figure 1 shows a twin-whip HF (high-frequency) antenna mounted on a su- 
perstructure representative of a shipboard environment. If a signal generator is 
connected directly to this antenna, not all the power delivered to the antenna can 
be radiated by the antenna. If an impedance mismatch exists between the signal 
generator and the antenna, some of the signal power is reflected from the antenna 
back to the generator. To effectively use this antenna, 
a matching circuit must be inserted between the signal 
generator and antenna to minimize this wasted power. 

Figure 2 shows the matching circuit connecting the 
generator to the antenna. Port 1 is the input from the 
generator. Port 2 is the output that feeds the antenna. 

The matching circuit is called a 2-port. Because the 
2-port must not waste power, the circuit designer only 
considers lossless 2-ports. The mathematician knows 
the lossless 2-ports as the $2 \times 2$ inner functions. The 
matching problem is to find a lossless 2-port that trans- 
fers as much power as possible from the generator to 
the antenna. 

The mathematical reader can see antennas every- 
where: on cars, on rooftops, sticking out of cell phones. 

A realistic model of an antenna is extremely complex 
because the antenna is embedded in its environment. 
Fortunately, we only need to know how the antenna behaves 
as a 1-port device. As indicated in Figure 2, the 
antenna’s scattering function or reflectance $s_L$ characterizes its 1-port behavior. 
The mathematician knows $s_L$ as an element in the unit ball of $H^\infty$. 

Figure 1. (Courtesy of Antenna Products.) 

Figure 3 displays $s_L : j\mathbb{R} \to \mathbb{C}$ of an HF antenna measured over the frequency 
range of 9 to 30 MHz. (Here $j = +\sqrt{-1}$ because $i$ is used for current.) At 
each radian frequency $\omega = 2\pi f$, where $f$ is the frequency in Hertz, $s_L(j\omega)$ is a 
complex number in the unit disk that specifies the relative strength and phase 
of the reflection from the antenna when it is driven by a pure tone of frequency 
$\omega$. Thus $s_L(j\omega)$ measures how efficiently we could broadcast a pure sinusoid of 
frequency $\omega$ by directly connecting the sinusoidal signal generator to the antenna. 
If $|s_L(j\omega)|$ is near 0, almost no signal is reflected back by the antenna towards 
the generator or, equivalently, almost all of the signal power passes through the 
antenna to be radiated into space. If $|s_L(j\omega)|$ is near 1, most of this signal is 
reflected back from the antenna and so very little signal power is radiated. 

Figure 2. An antenna connected to a lossless matching 2-port. 

Most signals are not pure tones, but may be represented in the usual way 
as a Fourier superposition of pure tones taken over a band of frequencies. In 
this case, the reflectance function evaluated at each frequency in the band mul- 
tiplies the corresponding frequency component of the incident signal. The net 
reflection is the superposition of the resulting component reflections. To ensure 
that an undistorted version of the generated signal is radiated from the antenna, 
the circuit designer looks for a lossless 2-port that “pulls $s_L(j\omega)$ to 0 over all 
frequencies in the band.” As a general rule, the circuit designer must pull $s_L(j\omega)$ 
inside the disk of radius 0.6 at the very least. 
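
To make the 0.6 rule of thumb concrete, the following minimal sketch converts a 
reflectance magnitude into the fraction of incident power that is reflected, 
$|s|^2$, and the fraction that is accepted, $1 - |s|^2$. The function name 
`power_split` is hypothetical and is introduced here only for illustration. 

```python
def power_split(reflectance: complex) -> tuple:
    """Return (reflected, accepted) fractions of the incident power.

    Assumes a passive 1-port, so |reflectance| <= 1.
    """
    reflected = abs(reflectance) ** 2
    return reflected, 1.0 - reflected


# Sweep a few reflectance magnitudes, including the 0.6 threshold.
for magnitude in (0.0, 0.6, 1.0):
    reflected, accepted = power_split(magnitude)
    print(f"|s| = {magnitude:.1f}: reflected {reflected:.2f}, accepted {accepted:.2f}")
```

At the threshold $|s| = 0.6$, roughly 36% of the incident power is reflected and 
64% is accepted by the antenna. 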

To take a concrete example, the circuit designer may match the HF antenna 
using a transformer as shown in Figure 4. If we put a signal into Port 1 
of the transformer and measure the reflected signal, their ratio is the scattering 
function $s_1$. That is, $s_1$ is how the antenna looks when viewed through the 
transformer. The circuit designer attempts to find a transformer so that the “matched 
antenna” has a small reflectance. Figure 5 shows that the optimal transformer does 
provide a minimally acceptable match for the HF antenna. The grey disk shows 
all reflectances $|s| < 0.6$ and contains $s_1(j\omega)$ over the frequency band. 

Figure 4. An antenna connected to a matching transformer. 

However, this example raises the following question: Could we do better with a 
different matching circuit? Typically, a circuit designer selects a circuit topology, 
selects the reactive elements (inductors and capacitors), and then undertakes a 
constrained optimization over the acceptable element values. The difficulty of 
this approach lies in the fact that there are many circuit topologies and each 
presents a highly nonlinear optimization problem. This forces the circuit designer 
to undertake a massive search to determine an optimal network topology with 
no stopping criteria. In practice, often the circuit designer throws circuit after 
circuit at the problem and hopes for a lucky hit. And there is always the nagging 
question: What is the best matching possible? Remarkably, “pure” mathematics 
has much to say about this analog signal processing problem. 



2. A Synopsis of the $H^\infty$ Solution 

Our presentation of the impedance matching problem weaves together many 
diverse mathematical and technological threads. This motivates beginning with 
the big picture of the story, leaving the details of the structure to the 
subsequent sections. In this spirit, the reader is asked to accept for now that to 
every $N$-port (generalizing the 1- and 2-ports we have just encountered), there 
corresponds an $N \times N$ scattering matrix $S \in H^\infty(\mathbb{C}_+, \mathbb{C}^{N \times N})$, whose entries 
are analytic functions of frequency generalizing the reflectances of the previous 
section. Mathematically, $S : \mathbb{C}_+ \to \mathbb{C}^{N \times N}$ is a mapping from the open right half 
plane $\mathbb{C}_+$ (parameterizing complex frequency) to the space of complex $N \times N$ 
matrices that is analytic and bounded with sup-norm 

\[ \|S\|_\infty := \operatorname{ess\,sup}\{\|S(j\omega)\| : \omega \in \mathbb{R}\} < \infty. \]

For a 1-port, $S$ is scalar-valued and, as we saw previously, is called a scattering 
function or reflectance. Scattering matrix entries for physical circuits are not 
arbitrary functions of frequency. The circuits in this paper are linear, causal, 
time-invariant, and solvable. These constraints force their scattering matrices 
into $H^\infty$; see [3; 4; 31]. 

Figure 5. The reflectance $s_L$ (solid line) of an HF antenna and the reflectance 
$s_1$ (dotted line) obtained by a matching transformer. (Plot annotation: 
f = 9-30 MHz; n = 1.365.) 
Figure 6 presents the schematic of the matching 2-port. The matching 2-port 
is characterized by its $2 \times 2$ scattering matrix 

\[ S(j\omega) = \begin{bmatrix} s_{11}(j\omega) & s_{12}(j\omega) \\ s_{21}(j\omega) & s_{22}(j\omega) \end{bmatrix}. \]

The matrix entries measure the output response of the 2-port. For example, $s_{22}$ 
measures the response reflected from Port 2 when a unit signal is driving Port 2; 
$s_{12}$ is the signal from Port 1 in response to a unit signal input to Port 2. If the 
2-port consumes power, it is called passive and its corresponding scattering 
matrix is a contraction on $j\mathbb{R}$: 

\[ S(j\omega)^H S(j\omega) \le \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]

almost everywhere in frequency (a.e. in $\omega$), or equivalently $S$ belongs to the 
closed unit ball: $S \in \overline{B}H^\infty(\mathbb{C}_+, \mathbb{C}^{2 \times 2})$. The reflectances of the generator and 
load are assumed to be passive also: $s_G, s_L \in \overline{B}H^\infty(\mathbb{C}_+)$. Because the goal is 
to avoid wasting power, the circuit designer matches the generator to the load 
using a lossless 2-port: 

\[ S(j\omega)^H S(j\omega) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{a.e.} \]

Scattering matrices satisfying this constraint provide the most general model for 
lossless 2-ports. These are the $2 \times 2$ real inner functions, denoted by $U^+(2) \subset 
H^\infty(\mathbb{C}_+, \mathbb{C}^{2 \times 2})$. The circuit designer does not actually have access to all of 
$U^+(2)$ through practical electrical networks. Instead, the circuit designer 
optimizes over a practical subclass $\mathcal{U} \subseteq U^+(2)$. For example, some antenna 
applications restrict the total number $d$ of inductors and capacitors. In this case, 
$\mathcal{U} = U^+(2,d)$ consists of the real, rational, inner functions of Smith-McMillan 
degree not exceeding $d$ (defined in Theorem 6.2). 

Figure 6. Matching circuit and reflectances. 

The figure-of-merit for the matching problem of Figure 6 is the transducer 
power gain $G_T$ defined as the ratio of the power delivered to the load to the 
maximum power available from the generator [44, pages 606-608]: 

\[ G_T(s_G, S, s_L) := |s_{21}|^2 \, \frac{1 - |s_G|^2}{|1 - s_1 s_G|^2} \, \frac{1 - |s_L|^2}{|1 - s_{22} s_L|^2}, \tag{2-1} \]

where $s_1$ is the reflectance seen looking into Port 1 of the matching circuit with 
the load $s_L$ terminating Port 2. This is computed by acting on $s_L$ by a 
linear-fractional transform parameterized by the matrix $S$: 

\[ s_1 = \mathcal{F}_1(S, s_L) := s_{11} + s_{12} s_L (1 - s_{22} s_L)^{-1} s_{21}. \tag{2-2} \]

Likewise, looking into Port 2 with Port 1 terminated in $s_G$ gives the reflectance 

\[ s_2 = \mathcal{F}_2(S, s_G) := s_{22} + s_{21} s_G (1 - s_{11} s_G)^{-1} s_{12}. \tag{2-3} \]



The worst case performance of the matching circuit $S$ is represented by the 
minimum of the gain over frequency: 

\[ \|G_T(s_G, S, s_L)\|_{-\infty} := \operatorname{ess\,inf}\{|G_T(s_G, S, s_L; j\omega)| : \omega \in \mathbb{R}\}. \]

In terms of this gain we can formulate the Matching Problem: 

Matching Problem. Maximize the worst case of the transducer power gain 
$G_T$ over a collection $\mathcal{U} \subseteq U^+(2)$ of matching 2-ports: 

\[ \sup\{\|G_T(s_G, S, s_L)\|_{-\infty} : S \in \mathcal{U}\}. \]

The current approach is to convert the 2-port matching problem to an equivalent 
1-port problem and optimize over an orbit in the hyperbolic disk. Specifically, 
the transducer power gain can be written 

\[ G_T(s_G, S, s_L) = 1 - \Delta P(\mathcal{F}_2(S, s_G), s_L)^2 = 1 - \Delta P(s_G, \mathcal{F}_1(S, s_L))^2, \]

where the power mismatch 

\[ \Delta P(s_1, s_2) := \left| \frac{\overline{s_1} - s_2}{1 - s_1 s_2} \right| \]

is the pseudohyperbolic distance between $\overline{s_1}$ and $s_2$. The orbit of the generator’s 
reflectance $s_G$ under the action of $\mathcal{U}$ is the set of reflectances 

\[ \mathcal{F}_2(\mathcal{U}, s_G) := \{\mathcal{F}_2(S, s_G) : S \in \mathcal{U}\} \subseteq \overline{B}H^\infty(\mathbb{C}_+). \]

Thus, the matching problem is equivalent to maximizing the transducer power 
gain over this orbit. The transducer power gain is bounded as follows: 

\[
\begin{aligned}
\sup\{\|G_T(s_G, S, s_L)\|_{-\infty} : S \in \mathcal{U}\}
 &= 1 - \inf\{\|\Delta P(\mathcal{F}_2(S, s_G), s_L)\|_\infty^2 : S \in \mathcal{U}\} \\
 &= 1 - \inf\{\|\Delta P(s_2, s_L)\|_\infty^2 : s_2 \in \mathcal{F}_2(\mathcal{U}, s_G)\} \\
 &\le 1 - \inf\{\|\Delta P(s_2, s_L)\|_\infty^2 : s_2 \in \overline{B}H^\infty(\mathbb{C}_+)\}.
\end{aligned}
\]

Expressing matching in terms of power mismatch in this way manifests the 
underlying hyperbolic geometry approximation problem. The reflectance of the 
generator is transformed to various new reflectances in the hyperbolic disk 
under the action of the possible matching circuits. We look for the closest approach 
of this orbit to the load $s_L$ with respect to the (pseudo) hyperbolic metric. The 
last bound is reducible to a matrix calculation by a hyperbolic version of 
Nehari’s Theorem [42], a classic result relating analytic approximation to an 
operator norm calculation. The resulting Nehari bound gives the circuit designer an 
upper limit on the possible performance for any class $\mathcal{U} \subseteq U^+(2)$ of matching 
circuits. For some classes, this bound is tight, telling the circuit designer that 
the benchmark is essentially obtainable with matching circuits from the specified 
class. For example, when $\mathcal{U}$ is the class of all lumped lossless 2-ports (networks 
of discrete inductors and capacitors) 

\[ U^+(2, \infty) := \bigcup_{d \ge 0} U^+(2, d) \]

and $s_G = 0$, Darlington’s Theorem establishes that 

\[ \sup\{\|G_T(s_G = 0, S, s_L)\|_{-\infty} : S \in U^+(2, \infty)\} = 1 - \inf\{\|\Delta P(s_2, s_L)\|_\infty^2 : s_2 \in \overline{B}H^\infty(\mathbb{C}_+)\}, \]

provided $s_L$ is sufficiently smooth. In this case, the circuit designer knows that 
there are lumped, lossless 2-ports that get arbitrarily close to the Nehari bound. 
The limitation of this approach is the requirement that the generator reflectance 
$s_G = 0$, which is not always true. Thus, a good research topic is to relax this 
constraint, or to generalize Darlington’s Theorem. Another limitation of the 
techniques described in this paper is that the Nehari methods produce only a 
bound; they do not supply the matching circuit. However, the techniques do 
compute the optimal $s_2$, leading to another excellent research topic: the 
“unitary dilation” of $s_2$ to a scattering matrix with $s_2 = s_{22}$. That such substantial 
research topics naturally arise shows how an applied problem brings depth to 
mathematical investigations. 
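
The identity relating the transducer power gain to the power mismatch can be 
checked numerically at a single frequency. The sketch below is an illustration 
only: the helper names (`lossless_two_port`, `f1`, `f2`, `transducer_gain`, 
`power_mismatch`) and the sample reflectances are hypothetical, and the power 
mismatch is taken with a complex conjugate on its first argument, 
$\Delta P(s_1, s_2) = |(\overline{s_1} - s_2)/(1 - s_1 s_2)|$, which is an assumption 
about the intended formula. 

```python
import cmath
import math


def lossless_two_port(a_mag, arg_a, arg_b, phi):
    """Return a 2x2 unitary matrix [[a, b], [-r*conj(b), r*conj(a)]], |r| = 1."""
    a = a_mag * cmath.exp(1j * arg_a)
    b = math.sqrt(1.0 - a_mag ** 2) * cmath.exp(1j * arg_b)
    r = cmath.exp(1j * phi)
    return ((a, b), (-r * b.conjugate(), r * a.conjugate()))


def f1(S, s_load):
    """Reflectance looking into Port 1 with Port 2 terminated in s_load (2-2)."""
    (s11, s12), (s21, s22) = S
    return s11 + s12 * s_load * s21 / (1 - s22 * s_load)


def f2(S, s_gen):
    """Reflectance looking into Port 2 with Port 1 terminated in s_gen (2-3)."""
    (s11, s12), (s21, s22) = S
    return s22 + s21 * s_gen * s12 / (1 - s11 * s_gen)


def transducer_gain(s_gen, S, s_load):
    """Transducer power gain of Equation (2-1) at one frequency."""
    (_, _), (s21, s22) = S
    s1 = f1(S, s_load)
    return (abs(s21) ** 2
            * (1 - abs(s_gen) ** 2) / abs(1 - s1 * s_gen) ** 2
            * (1 - abs(s_load) ** 2) / abs(1 - s22 * s_load) ** 2)


def power_mismatch(s1, s2):
    """Assumed form: pseudohyperbolic distance between conj(s1) and s2."""
    return abs((s1.conjugate() - s2) / (1 - s1 * s2))


# Hypothetical sample values: one lossless 2-port and two passive reflectances.
S = lossless_two_port(0.3, 0.4, -1.1, 2.0)
s_gen, s_load = 0.2 + 0.1j, -0.5 + 0.3j

gain = transducer_gain(s_gen, S, s_load)
via_port2 = 1 - power_mismatch(f2(S, s_gen), s_load) ** 2
via_port1 = 1 - power_mismatch(s_gen, f1(S, s_load)) ** 2
print(gain, via_port2, via_port1)  # the three values agree up to rounding
```

The check uses only the unitarity of $S$ at the chosen frequency; it says nothing 
about realizability of $S$ by an actual circuit. 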

3. Technical Preliminaries 

The real numbers are denoted by $\mathbb{R}$. The complex numbers are denoted by 
$\mathbb{C}$. The set of complex $M \times N$ matrices is denoted by $\mathbb{C}^{M \times N}$. $I_N$ and $0_N$ denote 
the $N \times N$ identity and zero matrices. Complex frequency is written $p = \sigma + j\omega$. 
The open right-half plane is denoted by $\mathbb{C}_+ := \{p \in \mathbb{C} : \operatorname{Re}[p] > 0\}$. The open 
unit disk is denoted by $\mathbb{D}$ and the unit circle by $\mathbb{T}$. 

3.1. Function spaces. 

• $L^\infty(j\mathbb{R})$ denotes the class of Lebesgue-measurable functions defined on $j\mathbb{R}$ 
with norm $\|\phi\|_\infty := \operatorname{ess\,sup}\{|\phi(j\omega)| : \omega \in \mathbb{R}\}$. 

• $C_0(j\mathbb{R})$ denotes the subspace of those continuous functions on $j\mathbb{R}$ that vanish 
at $\pm\infty$ with sup norm. 

• $H^\infty(\mathbb{C}_+)$ denotes the Hardy space of functions bounded and analytic on $\mathbb{C}_+$ 
with norm $\|h\|_\infty := \sup\{|h(p)| : p \in \mathbb{C}_+\}$. 

$H^\infty(\mathbb{C}_+)$ is identified with a subspace of $L^\infty(j\mathbb{R})$ whose elements are obtained by 
the pointwise limit $h(j\omega) = \lim_{\sigma \to 0} h(\sigma + j\omega)$ that converges almost everywhere 
[39, page 153]. Convergence in norm occurs if and only if the $H^\infty$ function has 
continuous boundary values. Those $H^\infty$ functions with continuous boundary 
values constitute the disk algebra: 

• $\mathcal{A}_1(\mathbb{C}_+) := 1 \dotplus \bigl(H^\infty(\mathbb{C}_+) \cap C_0(j\mathbb{R})\bigr)$ denotes those continuous $H^\infty(\mathbb{C}_+)$ functions 
that are constant at infinity. 

These spaces nest as 

\[ \mathcal{A}_1(\mathbb{C}_+) \subset H^\infty(\mathbb{C}_+) \subset L^\infty(j\mathbb{R}). \]

Tensoring with $\mathbb{C}^{M \times N}$ gives the corresponding matrix-valued functions: 

\[ L^\infty(j\mathbb{R}, \mathbb{C}^{M \times N}) := L^\infty(j\mathbb{R}) \otimes \mathbb{C}^{M \times N} \]

with norm $\|\phi\|_\infty := \operatorname{ess\,sup}\{\|\phi(j\omega)\| : \omega \in \mathbb{R}\}$ induced by the matrix norm. 

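3.2. The unit balls. The open unit ball of $L^\infty(j\mathbb{R}, \mathbb{C}^{M \times N})$ is denoted as 

\[ BL^\infty(j\mathbb{R}, \mathbb{C}^{M \times N}) := \{\phi \in L^\infty(j\mathbb{R}, \mathbb{C}^{M \times N}) : \|\phi\|_\infty < 1\}. \]

The closed unit ball of $L^\infty(j\mathbb{R}, \mathbb{C}^{M \times N})$ is denoted as 

\[ \overline{B}L^\infty(j\mathbb{R}, \mathbb{C}^{M \times N}) := \{\phi \in L^\infty(j\mathbb{R}, \mathbb{C}^{M \times N}) : \|\phi\|_\infty \le 1\}. \]

Likewise, the open unit ball of $H^\infty(\mathbb{C}_+, \mathbb{C}^{M \times N})$ is 

\[ BH^\infty(\mathbb{C}_+, \mathbb{C}^{M \times N}) := BL^\infty(j\mathbb{R}, \mathbb{C}^{M \times N}) \cap H^\infty(\mathbb{C}_+, \mathbb{C}^{M \times N}). \]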

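3.3. The real inner functions. The class of real $H^\infty(\mathbb{C}_+, \mathbb{C}^{M \times N})$ functions 
is denoted 

\[ \operatorname{Re} H^\infty(\mathbb{C}_+, \mathbb{C}^{M \times N}) := \{S \in H^\infty(\mathbb{C}_+, \mathbb{C}^{M \times N}) : \overline{S(\overline{p})} = S(p)\}. \]

A function $S \in H^\infty(\mathbb{C}_+, \mathbb{C}^{N \times N})$ is called inner provided 

\[ S(j\omega)^H S(j\omega) = I_N \quad \text{a.e.} \]

The class of real inner functions is denoted 

\[ U^+(N) := \{S \in \operatorname{Re} \overline{B}H^\infty(\mathbb{C}_+, \mathbb{C}^{N \times N}) : S(j\omega)^H S(j\omega) = I_N \ \text{a.e.}\}. \]

Lemma 3.1. $U^+(N)$ is a closed subset of the boundary of $\operatorname{Re} \overline{B}H^\infty(\mathbb{C}_+, \mathbb{C}^{N \times N})$. 

PROOF. It suffices to show closure. If $\{S_m\} \subset U^+(N)$ converges to $S \in 
H^\infty(\mathbb{C}_+, \mathbb{C}^{N \times N})$, then $S_m(j\omega) \to S(j\omega)$ almost everywhere so that 

\[ I_N = \lim_{m \to \infty} S_m(j\omega)^H S_m(j\omega) = S(j\omega)^H S(j\omega) \quad \text{a.e.} \]

That is, $S(j\omega)$ is unitary almost everywhere or $S \in U^+(N)$. □ 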

3.4. The weak-* topology. We use the weak-* topology on $L^\infty(j\mathbb{R}) = 
L^1(j\mathbb{R})^*$. A weak-* subbasis at $0 \in L^\infty(j\mathbb{R})$ is the collection of weak-* open sets 

\[ O[w, \varepsilon] := \{\phi \in L^\infty(j\mathbb{R}) : |\langle w, \phi \rangle| < \varepsilon\}, \]

where $\varepsilon > 0$, $w \in L^1(j\mathbb{R})$, and 

\[ \langle w, \phi \rangle := \int_{-\infty}^{\infty} w(j\omega)\,\phi(j\omega)\,d\omega. \]

Every weak-* open set that contains $0 \in L^\infty(j\mathbb{R})$ is a union of finite intersections 
of these subbasic sets. The Banach-Alaoglu Theorem [47, Theorem 3.15] gives 
that the unit ball $\overline{B}L^\infty(j\mathbb{R})$ is weak-* compact. The next lemma shows that the 
same holds for a distorted version of the unit ball, a fact that will have significant 
import for the optimization problems we consider later. 

Lemma 3.2. Let $c, r \in L^\infty(j\mathbb{R})$ with $r \ge 0$ define the disk 

\[ D(c, r) := \{\phi \in L^\infty(j\mathbb{R}) : |\phi - c| \le r \ \text{a.e.}\}. \]

Then $D(c, r)$ is a closed, convex subset of $L^\infty(j\mathbb{R})$ that is also weak-* compact. 

PROOF. Closure and convexity follow from pointwise closure and convexity. 
To prove weak-* compactness, let $M_r : L^\infty(j\mathbb{R}) \to L^\infty(j\mathbb{R})$ be multiplication: 
$M_r \phi := r\phi$. Observe $D(c, r) = c + M_r \overline{B}L^\infty(j\mathbb{R})$. Assume for now that $M_r$ is 
weak-* continuous. Then $M_r \overline{B}L^\infty(j\mathbb{R})$ is weak-* compact, because $\overline{B}L^\infty(j\mathbb{R})$ 
is weak-* compact, and the image of a compact set under a continuous function 
is compact. This forces $D(c, r)$ to be weak-* compact, provided $M_r$ is weak-* 
continuous. To see that $M_r$ is weak-* continuous, it suffices to show that $M_r$ 
pulls subbasic sets back to subbasic sets. Let $\varepsilon > 0$, $w \in L^1(j\mathbb{R})$. Then 

\[ \phi \in M_r^{-1}(O[w, \varepsilon]) \iff |\langle w, r\phi \rangle| < \varepsilon \iff |\langle rw, \phi \rangle| < \varepsilon \iff \phi \in O[rw, \varepsilon], \]

noting that $rw \in L^1(j\mathbb{R})$. □ 

If $K$ is a convex subset of $L^\infty(j\mathbb{R})$, then $K$ is closed $\iff$ $K$ is weak-* closed [17, 
page 422]. Because $H^\infty(\mathbb{C}_+)$ is a closed subspace of $L^\infty(j\mathbb{R})$, it is also weak-* 
closed. Intersecting weak-* closed $H^\infty(\mathbb{C}_+)$ with the weak-* compact unit ball 
of $L^\infty(j\mathbb{R})$ forces $\overline{B}H^\infty(\mathbb{C}_+)$ to be weak-* compact. 
3.5. The Cayley transform. Many computations are more conveniently 
placed in function spaces defined on the open unit disk $\mathbb{D}$ rather than on the 
open right half-plane $\mathbb{C}_+$. The notation for the spaces on the disk follows the 
preceding nomenclature with the unit disk $\mathbb{D}$ replacing $\mathbb{C}_+$ and the unit circle 
$\mathbb{T}$ replacing $j\mathbb{R}$. $H^\infty(\mathbb{D})$ denotes the collection of analytic functions on the 
open unit disk with essentially bounded boundary values. $C(\mathbb{T})$ denotes the 
continuous functions on the unit circle, $\mathcal{A}(\mathbb{D}) := H^\infty(\mathbb{D}) \cap C(\mathbb{T})$ denotes the disk 
algebra, and $L^\infty(\mathbb{T})$ denotes the Lebesgue-measurable functions on the unit circle 
$\mathbb{T}$ with norm determined by the essential bound. A Cayley transform connects 
the function spaces on the right half plane to their counterparts on the disk. 

Lemma 3.3 ([27, page 99]). Let the Cayley transform $c : \mathbb{C}_+ \to \mathbb{D}$, 

\[ c(p) := \frac{p - 1}{p + 1}, \]

extend to the composition operator $c : L^\infty(\mathbb{T}) \to L^\infty(j\mathbb{R})$ as 

\[ h(p) := H \circ c(p) \qquad (p = j\omega). \]

Then $c$ is an isometry mapping 

\[
\left\{ \begin{array}{c} \mathcal{A}(\mathbb{D}) \\ H^\infty(\mathbb{D}) \\ C(\mathbb{T}) \\ L^\infty(\mathbb{T}) \end{array} \right\}
\ \text{onto} \
\left\{ \begin{array}{c} \mathcal{A}_1(\mathbb{C}_+) \\ H^\infty(\mathbb{C}_+) \\ 1 \dotplus C_0(j\mathbb{R}) \\ L^\infty(j\mathbb{R}) \end{array} \right\}.
\]
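
The geometric content of the Cayley transform can be illustrated with a few 
point evaluations. The sketch below is a minimal illustration with hypothetical 
function names (`cayley`, `inverse_cayley`): it maps points $j\omega$ of the 
imaginary axis to the unit circle and a point of the open right half-plane 
into the open unit disk, and then inverts the map. 

```python
def cayley(p: complex) -> complex:
    """Cayley transform c(p) = (p - 1) / (p + 1); undefined at p = -1."""
    return (p - 1) / (p + 1)


def inverse_cayley(z: complex) -> complex:
    """Inverse map p = (1 + z) / (1 - z); undefined at z = 1."""
    return (1 + z) / (1 - z)


# Points on the imaginary axis land on the unit circle.
for omega in (-10.0, -1.0, 0.0, 0.5, 3.0):
    z = cayley(1j * omega)
    print(f"omega = {omega:6.1f}  |c(j omega)| = {abs(z):.12f}")

# A point with positive real part lands strictly inside the disk.
p = 2.0 + 3.0j
z = cayley(p)
print(abs(z) < 1, inverse_cayley(z))
```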



3.6. Factoring $H^\infty$ functions. The boundary values and inner-outer 
factorization of $H^\infty$ functions are notions most conveniently developed on the unit 
disk and then transplanted to the right half-plane by the Cayley transform [35]. 
Let $\phi \in L^1(\mathbb{T})$ have the Fourier expansion in $z = \exp(j\theta)$ 

\[ \phi(z) = \sum_{n=-\infty}^{\infty} \hat{\phi}(n) z^n; \qquad \hat{\phi}(n) := \int_{-\pi}^{\pi} e^{-jn\theta} \phi(e^{j\theta}) \frac{d\theta}{2\pi}. \]

For $1 \le p \le \infty$, define $H^p(\mathbb{D})$ as the subspace of $L^p(\mathbb{T})$ with vanishing negative 
Fourier coefficients [27, page 77]: 

\[ H^p(\mathbb{D}) := \{h \in L^p(\mathbb{T}) : \hat{h}(n) = 0 \ \text{for} \ n = -1, -2, \ldots\}. \]

Then $H^p(\mathbb{D})$ is a closed subspace of $L^p(\mathbb{T})$ and the spaces nest as [27, page 3]: 

\[ H^\infty(\mathbb{D}) \subset H^{p_2}(\mathbb{D}) \subset H^{p_1}(\mathbb{D}) \subset H^1(\mathbb{D}) \qquad (1 \le p_1 \le p_2 \le \infty). \]

Each $h \in H^p(\mathbb{D})$ admits an analytic extension on the open unit disk [27, p. 77]: 

\[ h(z) = \sum_{n=0}^{\infty} \hat{h}(n) z^n \qquad (z = re^{j\theta}). \]

From the analytic extension, define $h_r(e^{j\theta}) := h(re^{j\theta})$ for $0 \le r < 1$. For $r < 1$, 
$h_r$ is continuous and analytic. As $r$ increases to 1, $h_r$ converges to $h$ in the $L^p$ 
norm, provided $1 \le p < \infty$. For $p = \infty$, $h_r$ converges to $h$ in the weak-* topology 
(discussed in Section 3.4). If $h_r$ does converge to $h$ in the $L^\infty$ norm, convergence 
is uniform and forces $h \in \mathcal{A}(\mathbb{D})$. Although the disk algebra $\mathcal{A}(\mathbb{D})$ is a strict subset 
of $H^\infty(\mathbb{D})$ in the norm topology, it is a weak-* dense subset. 

If $\phi$ is a positive, measurable function with $\log(\phi) \in L^1(\mathbb{T})$ then the analytic 
function [48, page 370]: 

\[ q(z) = \exp\left( \int_{-\pi}^{\pi} \frac{e^{jt} + z}{e^{jt} - z} \log|\phi(e^{jt})| \frac{dt}{2\pi} \right) \qquad (z \in \mathbb{D}), \]

is called an outer function. The magnitude of $q(z)$ matches $\phi$ [48, page 371]: 

\[ \lim_{r \to 1} |q(re^{j\theta})| = \phi(e^{j\theta}) \quad \text{(a.e.)} \]

and leads to the equivalence: $\phi \in L^p(\mathbb{T}) \iff q \in H^p(\mathbb{D})$. We call $q(z)$ a spectral 
factor of $\phi$. Every $h \in H^\infty(\mathbb{D})$ admits an inner-outer factorization [48, pages 
370-375]: 

\[ h(z) = e^{j\theta_0} b(z) s(z) q(z), \]

where the outer function $q(z)$ is a spectral factor of $|h|$ and the inner function 
consists of the Blaschke product [48, page 333] 

\[ b(z) := z^k \prod_{n=1}^{\infty} \frac{z_n - z}{1 - \overline{z_n} z} \, \frac{|z_n|}{z_n}, \]

$z_n \ne 0$, $\sum (1 - |z_n|) < \infty$, and the singular inner function 

\[ s(z) = \exp\left( -\int_{-\pi}^{\pi} \frac{e^{jt} + z}{e^{jt} - z} \, d\mu(t) \right), \]

for $\mu$ a finite, positive, Borel measure on $\mathbb{T}$ that is singular with respect to 
the Lebesgue measure. In the electrical engineering setup, we will see that the 
Blaschke products correspond to lumped, lossless circuits while a transmission 
line corresponds to a singular inner function. 
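
The defining property of an inner function, unit modulus on the circle, is easy 
to observe for a finite Blaschke product. The following sketch is illustrative 
only: the function name `blaschke` and the sample zeros are hypothetical, and 
the product is truncated to finitely many factors of the form 
$\frac{|z_n|}{z_n} \frac{z_n - z}{1 - \overline{z_n} z}$. 

```python
import cmath


def blaschke(z, zeros, k=0):
    """Finite Blaschke product z^k * prod (|zn|/zn) (zn - z) / (1 - conj(zn) z).

    Every zero zn must satisfy 0 < |zn| < 1.
    """
    value = complex(z) ** k
    for zn in zeros:
        zn = complex(zn)
        value *= (abs(zn) / zn) * (zn - z) / (1 - zn.conjugate() * z)
    return value


# Hypothetical zeros inside the open unit disk.
zeros = [0.5 + 0.0j, -0.3 + 0.4j, 0.1 - 0.7j]

# On the unit circle the modulus is 1 up to rounding.
for theta in (0.0, 1.0, 2.5):
    print(abs(blaschke(cmath.exp(1j * theta), zeros, k=1)))

# Inside the disk the modulus is strictly less than 1.
print(abs(blaschke(0.2 + 0.1j, zeros, k=1)) < 1)
```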



4. Electric Circuits 

The impedance matching problem may be formulated as an optimization of 
certain natural figures of merit over structured sets of candidate electrical match- 
ing networks. We begin the formulation in this section, starting with an ex- 
amination of the sorts of electrical networks available for impedance matching. 
Consideration of various choices of coordinate systems parameterizing the set of 
candidate matching circuits leads to the scattering formalism as the most suit- 
able choice. Next we consider appropriate objective functions for measuring the 
utility of a candidate impedance matching circuit. This leads to description and 
characterization of power gain and mismatch functions as natural indicators of 
the suitability of our circuits. With the objective function and the parameteriza- 
tion of the admissible candidate set, we are in position to formulate impedance 
matching as a constrained optimization problem. We will see that hyperbolic 
geometry plays a natural and enabling role in this formulation. 

4.1. Basic components. Figure 7 represents an $N$-port: a box with $N$ 
pairs of wires sticking out of it. The use of the word “port” means that each 
pair of wires obeys a conservation of current: the current flowing into one 
wire of the pair equals the current flowing out of the other wire. We can imagine 
characterizing such a box by supplying current and voltage input signals of given 
frequency at the various ports and observing the currents and voltages induced 
at the other ports. Mathematically, the $N$-port is defined as the collection $\mathcal{N}$ of 
voltage $v(p)$ and current $i(p)$ vectors that can appear on its ports for all choices 
of the frequency $p = \sigma + j\omega$ [31]: 

\[ \mathcal{N} \subseteq L^2(j\mathbb{R}, \mathbb{C}^N) \times L^2(j\mathbb{R}, \mathbb{C}^N). \]

If $\mathcal{N}$ is a linear subspace, then the $N$-port is called a linear $N$-port. Figures 8 
and 9 present the fundamental linear 1-ports and 2-ports. These examples show 
that $\mathcal{N}$ can have the finer structure as the graph of a matrix-valued function: for 
instance, with the inductor $\mathcal{N}$ is the graph of the function $i(p) \mapsto pL\,i(p)$. 

Figure 7. The $N$-port. 

Figure 8. The lumped elements: resistor $v(p) = R\,i(p)$; capacitor $i(p) = pC\,v(p)$; 
inductor $v(p) = pL\,i(p)$. 
Figure 9. The transformer and gyrator. 

More generally, if the voltage and current are related as $v(p) = Z(p)i(p)$ 
then $Z(p)$ is called the impedance matrix with real and imaginary parts $Z(p) = 
R(p) + jX(p)$ called the resistance and reactance, respectively. If the voltage and 
current are related as $i(p) = Y(p)v(p)$ then $Y(p)$ is called the admittance matrix 
with real and imaginary parts $Y(p) = G(p) + jB(p)$ called the conductance and 
susceptance, respectively. The chain matrix $T(p)$ relates 2-port voltages and 
currents as 

\[ \begin{bmatrix} v_1 \\ i_1 \end{bmatrix} = \begin{bmatrix} t_{11}(p) & t_{12}(p) \\ t_{21}(p) & t_{22}(p) \end{bmatrix} \begin{bmatrix} v_2 \\ -i_2 \end{bmatrix}. \]

The ideal transformer has chain matrix [3, Eq. 2.4]: 

\[ \begin{bmatrix} v_1 \\ i_1 \end{bmatrix} = \begin{bmatrix} n^{-1} & 0 \\ 0 & n \end{bmatrix} \begin{bmatrix} v_2 \\ -i_2 \end{bmatrix}, \tag{4-1} \]

where $n$ is the turns ratio of the windings on the transformer. The gyrator has 
chain matrix [3, Eq. 2.14]: 

\[ \begin{bmatrix} v_1 \\ i_1 \end{bmatrix} = \begin{bmatrix} 0 & \alpha \\ \alpha^{-1} & 0 \end{bmatrix} \begin{bmatrix} v_2 \\ -i_2 \end{bmatrix}. \]

Figure 10 shows how the 1-ports can build the series and shunt 2-ports with 
chain matrices 

\[ T_{\text{series}}(p) = \begin{bmatrix} 1 & z(p) \\ 0 & 1 \end{bmatrix}, \qquad T_{\text{shunt}}(p) = \begin{bmatrix} 1 & 0 \\ y(p) & 1 \end{bmatrix}, \]

using the impedance $z(p)$ and admittance $y(p)$. 

Figure 10. Series and shunt 2-ports. 

Connecting the series 
and shunts in a “chain” produces a 2-port called a ladder. The ladder’s chain 
matrix is the product of the individual chain matrices of the series and shunt 
2-ports. For example, the low-pass ladders are a classic family of lossless matching 
2-ports. Figure 11 shows a low-pass ladder with Port 2 terminated in a load $z_L$. 
The low-pass ladder has chain matrix 

\[ T(p) = \begin{bmatrix} 1 & pL_1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ pC_1 & 1 \end{bmatrix} \begin{bmatrix} 1 & pL_2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ pC_2 & 1 \end{bmatrix} \begin{bmatrix} 1 & pL_3 \\ 0 & 1 \end{bmatrix}. \]

Figure 11. A low-pass ladder terminated in a load. 

The impedance looking into Port 1 is computed as 

\[ z_1 = \frac{t_{11} z_L + t_{12}}{t_{21} z_L + t_{22}} =: \mathcal{G}(T, z_L). \]

Thus, the chain matrices provide a natural parameterization for the orbit of the 
load $z_L$ under the action of the low-pass ladders. Section 2 showed that these 
orbits are fundamental for the matching problem. Even at this elementary level, 
the mathematician can raise some pretty substantial questions regarding how 
these ladders sit in $U^+(2)$ or how the orbit of the load sits in the unit ball of 
$H^\infty$. 
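
The ladder construction translates directly into a product of $2 \times 2$ matrices 
followed by a linear-fractional map. The sketch below is a minimal illustration: 
the function names and the element values (inductances in henries, capacitances 
in farads, a 50 ohm resistive load, a 10 MHz evaluation frequency) are 
hypothetical choices and are not taken from the antenna example. 

```python
import math


def matmul2(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    return ((a11 * b11 + a12 * b21, a11 * b12 + a12 * b22),
            (a21 * b11 + a22 * b21, a21 * b12 + a22 * b22))


def series(z):
    """Chain matrix of a series impedance z."""
    return ((1, z), (0, 1))


def shunt(y):
    """Chain matrix of a shunt admittance y."""
    return ((1, 0), (y, 1))


def low_pass_ladder(p, inductors, capacitors):
    """Chain matrix: series L1, shunt C1, series L2, shunt C2, ..., series L_last."""
    T = series(p * inductors[0])
    for L, C in zip(inductors[1:], capacitors):
        T = matmul2(T, shunt(p * C))
        T = matmul2(T, series(p * L))
    return T


def input_impedance(T, z_load):
    """Linear-fractional action z1 = (t11 zL + t12) / (t21 zL + t22)."""
    (t11, t12), (t21, t22) = T
    return (t11 * z_load + t12) / (t21 * z_load + t22)


# Hypothetical element values, evaluated on the imaginary axis at 10 MHz.
p = 1j * 2 * math.pi * 10e6
T = low_pass_ladder(p, [1e-6, 2e-6, 1e-6], [100e-12, 100e-12])
print(input_impedance(T, 50.0))
```

Each series or shunt factor has determinant 1, so the ladder's chain matrix does 
too; setting every capacitor to zero collapses the ladder to a single series 
inductance $L_1 + L_2 + L_3$, which gives a quick sanity check. 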

Unfortunately, the impedance, the admittance, and the chain formalisms do 
not provide ideal representations for all circuits of interest. For example, there 
are $N$-ports that do not have an impedance matrix (e.g., the transformer does 
not have an impedance matrix). There are difficulties inherent in attempting 
the matching problem in a formalism where some of the basic objects under 
discussion fail to exist. 

In fact, much of the debate in electrical engineering in the 1960’s focused 
on finding the right formalism that guaranteed that every $N$-port had a 
representation as the graph of a linear operator. For example, the existence of the 
impedance matrix $Z(p)$ is equivalent to 

\[ \mathcal{N} = \left\{ \begin{bmatrix} Z i \\ i \end{bmatrix} : i \in L^2(j\mathbb{R}, \mathbb{C}^N) \right\}, \]

but this formalism is not so useful when we need to describe circuits with 
transformers in them. The claim is that any linear, passive, time-invariant, solvable 
$N$-port always admits a scattering matrix $S \in \overline{B}H^\infty(\mathbb{C}_+, \mathbb{C}^{N \times N})$; see [3; 4; 31]. 
Consequently, we work the matching problem in the scattering formalism, which 
we now describe. 
4.2. The scattering matrices. Specializing to the 2-port in Figure 12, define 
the incident signal $a$ and the reflected signal $b$ from the port voltages $v$ and 
currents $i$ with respect to the normalizing matrix $R_0$ [footnote 1]: 

\[ a := \tfrac{1}{2}\{R_0^{-1/2} v + R_0^{1/2} i\}, \tag{4-2} \]

\[ b := \tfrac{1}{2}\{R_0^{-1/2} v - R_0^{1/2} i\}. \tag{4-3} \]

The scattering matrix maps the incident signal to the reflected signal: $b = Sa$. 

Figure 12. The 2-port scattering formalism. 

The scattering description can be readily related to other representations when 
the latter exist. For instance, the scattering matrix determines the impedance 
matrix as 

\[ \tilde{Z} := R_0^{-1/2} Z R_0^{-1/2} = (I + S)(I - S)^{-1}. \]

To see this, invert Equations 4-2 and 4-3 and substitute into $v = Zi$. Conversely, 
if the $N$-port admits an impedance matrix, normalize and Cayley transform to 
get 

\[ S = (\tilde{Z} - I)(\tilde{Z} + I)^{-1}. \]

Usually, $R_0 = r_0 I$ with $r_0 = 50$ ohms so the normalizing matrix disappears. 
The math guys always take $r_0 = 1$. The EE’s have endless arguments about 
normalizations. Unless stated otherwise, we’ll always normalize with respect to 
$r_0$. 

1 Two accessible books on the scattering parameters are [3] and [4]. The first of these 
omits the factor $\tfrac{1}{2}$ but carries this rescaling onto the power definitions. Most other books 
use the power-wave normalization [16]: $a = R_0^{-1/2}\{v + Z_0 i\}/2$, where the normalizing matrix 
$Z_0 = R_0 + jX_0$ is diagonal with diagonal resistance $R_0 > 0$ and reactance $X_0$. 
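
For a 1-port with scalar normalization $r_0$, the Cayley relations between 
impedance and reflectance reduce to $s = (z - r_0)/(z + r_0)$ and 
$z = r_0 (1 + s)/(1 - s)$. The sketch below illustrates this scalar case only; 
the function names and the sample impedances are hypothetical. 

```python
def reflectance_from_impedance(z: complex, r0: float = 50.0) -> complex:
    """Scalar case of S = (Z~ - I)(Z~ + I)^{-1} with Z~ = z / r0."""
    return (z - r0) / (z + r0)


def impedance_from_reflectance(s: complex, r0: float = 50.0) -> complex:
    """Scalar case of Z~ = (I + S)(I - S)^{-1}; undefined at s = 1."""
    return r0 * (1 + s) / (1 - s)


# A matched load reflects nothing, a short circuit reflects everything,
# and a purely reactive load has reflectance of modulus 1.
for z in (50.0, 0.0, 50.0j, 25.0 + 10.0j):
    s = reflectance_from_impedance(z)
    print(z, s, abs(s), impedance_from_reflectance(s))
```

A passive load (nonnegative resistance) always maps into the closed unit disk, 
which is the scalar form of the statement that passive circuits have 
contractive scattering matrices. 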
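4.3. The chain scattering matrix. Closely related to the scattering matrix 
is the chain scattering matrix $\Theta$ [25, page 148]: 

\[ \begin{bmatrix} b_1 \\ a_1 \end{bmatrix} = \Theta \begin{bmatrix} a_2 \\ b_2 \end{bmatrix} = \begin{bmatrix} \theta_{11} & \theta_{12} \\ \theta_{21} & \theta_{22} \end{bmatrix} \begin{bmatrix} a_2 \\ b_2 \end{bmatrix}. \]

When multiple 2-ports are connected in a chain, the chain scattering matrix of the 
chain is the product of the individual chain scattering matrices. The mappings 
between the scattering and chain scattering matrices are [25]: 

\[ \Theta = s_{21}^{-1} \begin{bmatrix} -\det[S] & s_{11} \\ -s_{22} & 1 \end{bmatrix}, \qquad S = \theta_{22}^{-1} \begin{bmatrix} \theta_{12} & \det[\Theta] \\ 1 & -\theta_{21} \end{bmatrix}. \tag{4-4} \]

Although every 2-port has a scattering matrix, it admits a chain scattering matrix 
only if $s_{21}$ is invertible. 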



4.4. Passive terminations. In Figure 6, Port 2 is terminated with the load 
reflectance $s_L$ so that 

\[ a_2 = s_L b_2. \tag{4-5} \]

Then the reflectance looking into Port 1 is obtained by the chain-scattering 
matrix: 

\[ s_1 = \frac{b_1}{a_1} = \frac{\theta_{11} a_2 + \theta_{12} b_2}{\theta_{21} a_2 + \theta_{22} b_2} = \frac{\theta_{11} s_L + \theta_{12}}{\theta_{21} s_L + \theta_{22}} =: \mathcal{G}_1(\Theta, s_L). \]

Equation 4-4 also allows us to express $s_1$ in terms of the linear-fractional form 
of the scattering matrix introduced in Equation 2-2: $s_1 = \mathcal{F}_1(S, s_L)$. Similarly, 
if Port 1 of the 2-port is terminated with the generator reflectance $s_G$, so that 
$a_1 = s_G b_1$, then the reflectance looking into Port 2 is 

\[ s_2 = \frac{b_2}{a_2} = \mathcal{G}_2(\Theta, s_G) := \frac{\theta_{11} s_G - \theta_{21}}{-\theta_{12} s_G + \theta_{22}} = \mathcal{F}_2(S, s_G), \]

with $\mathcal{F}_2(S, s_G)$ as introduced in Equation 2-3. 
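
The agreement between the chain-scattering and scattering forms of the two 
reflectances can be checked numerically. The sketch below is an illustration 
under stated assumptions: the function names and sample values are hypothetical, 
the conversions follow from solving $b = Sa$ for $(b_1, a_1)$ in terms of 
$(a_2, b_2)$, and the Port 2 formula is derived here from $a_1 = s_G b_1$ rather 
than quoted from a reference. 

```python
def chain_scattering_from_scattering(S):
    """Theta = (1/s21) [[-det S, s11], [-s22, 1]]; requires s21 != 0."""
    (s11, s12), (s21, s22) = S
    det_s = s11 * s22 - s12 * s21
    return ((-det_s / s21, s11 / s21), (-s22 / s21, 1 / s21))


def scattering_from_chain_scattering(Theta):
    """S = (1/theta22) [[theta12, det Theta], [1, -theta21]]; requires theta22 != 0."""
    (t11, t12), (t21, t22) = Theta
    det_t = t11 * t22 - t12 * t21
    return ((t12 / t22, det_t / t22), (1 / t22, -t21 / t22))


def g1(Theta, s_load):
    """Port 1 reflectance from Theta with a2 = s_load * b2."""
    (t11, t12), (t21, t22) = Theta
    return (t11 * s_load + t12) / (t21 * s_load + t22)


def g2(Theta, s_gen):
    """Port 2 reflectance from Theta with a1 = s_gen * b1 (derived, see lead-in)."""
    (t11, t12), (t21, t22) = Theta
    return (t11 * s_gen - t21) / (t22 - t12 * s_gen)


def f1(S, s_load):
    (s11, s12), (s21, s22) = S
    return s11 + s12 * s_load * s21 / (1 - s22 * s_load)


def f2(S, s_gen):
    (s11, s12), (s21, s22) = S
    return s22 + s21 * s_gen * s12 / (1 - s11 * s_gen)


# Hypothetical scattering matrix (not necessarily lossless) and terminations.
S = ((0.2 + 0.1j, 0.5 - 0.3j), (0.4 + 0.2j, -0.1 + 0.3j))
s_gen, s_load = 0.3 - 0.2j, -0.4 + 0.1j
Theta = chain_scattering_from_scattering(S)

print(g1(Theta, s_load), f1(S, s_load))  # agree up to rounding
print(g2(Theta, s_gen), f2(S, s_gen))    # agree up to rounding
```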



4.5. Active terminations. Equation 4-5 admits a generalization to include 
the generators. Figure 13 shows the labeling convention of the scattering 
variables. The generalization includes the scattering of the generator in terms of the 
voltage source [16, Eq. 3.2]: 

\[ b_G = s_G a_G + c_G; \qquad c_G := \frac{r_0^{1/2}}{z_G + r_0} v_G. \tag{4-6} \]

Figure 13. Scattering conventions. 

To get this result, use Equations 4-2 and 4-3 to write $v_1 = r_0^{1/2}(a_1 + b_1)$ and 
$i_1 = r_0^{-1/2}(a_1 - b_1)$. Substitute this into the voltage drops $v_G = z_G i_1 + v_1$ of 
Figure 13 to get 

\[ c_G = \frac{r_0^{1/2} v_G}{z_G + r_0} = a_1 - \frac{z_G - r_0}{z_G + r_0} b_1 = b_G - s_G a_G. \]

We can now analyze the setup in Figure 13. Equations 4-5 and 4-6 give 

\[ a = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} s_G & 0 \\ 0 & s_L \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} + \begin{bmatrix} c_G \\ c_L \end{bmatrix} =: S_X b + c_X, \]

with $c_L = 0$ for the passive load of Equation 4-5. Substitution into $b = Sa$ 
solves the 2-port scattering as 

\[ a = (I_2 - S_X S)^{-1} c_X. \]

4.6. Power flows in the 2-port. With respect to an $N$-port, the complex
power (see footnote 2) is [4, page 241]:

\[ W(p) := v(p)^H i(p). \]

Because $v(p)$ has units of volt-seconds and $i(p)$ has units of ampere-seconds, $W(p)$
has units of watts/Hz$^2$. The average power delivered to the $N$-port is [21, page 19]

\[ P_{\mathrm{avg}} := \tfrac{1}{2} \mathrm{Re}[W] = \tfrac{1}{2}\{a^H a - b^H b\} = \tfrac{1}{2} a^H \{I - S^H S\} a. \tag{4-7} \]

We carry the 1/2 along so that our power definitions coincide with [21]. If the
$N$-port consumes power ($P_{\mathrm{avg}} \ge 0$) for all its voltage and current pairs, then the
$N$-port is said to be passive. If the $N$-port consumes no power ($P_{\mathrm{avg}} = 0$) for all
its voltage and current pairs, then the $N$-port is said to be lossless. In terms of
the scattering matrices [28]:

• Passive: $S^H(j\omega) S(j\omega) \le I_N$

• Lossless: $S^H(j\omega) S(j\omega) = I_N$

for all $\omega \in \mathbb{R}$. Specializing these concepts to the 2-port of Figure 14 leads to
the following power flows:

• The average power delivered to Port 1 is

\[ P_1 := \tfrac{1}{2}(|a_1|^2 - |b_1|^2) = \frac{|a_1|^2}{2}(1 - |s_1|^2). \]

• The average power delivered to Port 2 is

\[ P_2 := \tfrac{1}{2}(|a_2|^2 - |b_2|^2) = -P_L. \]

• The average power delivered to the load is [21, Eq. 2.6.6]

\[ P_L := \tfrac{1}{2}(|a_L|^2 - |b_L|^2) = \frac{|b_2|^2}{2}(1 - |s_L|^2). \]

• The average power delivered by the generator:

\[ P_G := \tfrac{1}{2}(|b_G|^2 - |a_G|^2). \]

2 Baher uses [3, Eq. 2.17]: $W(p) = i(p)^H v(p)$.

Figure 14. Matching circuit and reflectances.

To compute $P_G$, observe that Figure 14 gives $a_G = b_1$ and $b_G = a_1$. Substitute
these and $b_1 = s_1 a_1$ into Equation 4-6 to get $c_G = (1 - s_G s_1) a_1$. Then

\[ P_G = \tfrac{1}{2}(|a_1|^2 - |b_1|^2) = \frac{|a_1|^2}{2}(1 - |s_1|^2) = \frac{|c_G|^2}{2}\, \frac{1 - |s_1|^2}{|1 - s_G s_1|^2}. \tag{4-8} \]

Lemma 4.1. Assume the setup of Figure 14. There always holds $P_2 = -P_L$ and
$P_G = P_1$. If the 2-port is lossless, $P_1 + P_2 = 0$.

4.7. The power gains in the 2-port. The matching network maps the 
generator’s power into a form that we hope will be more useful at the load 
than if the generator drove the load directly. The modification of power is 
generically described as “gain.” The matching problem puts us in the business of 
gain computations, and we need the maximum power and mismatch definitions. 
The maximum power available from a generator is defined as the average power
delivered by the generator to a conjugately matched load. Use Equation 4-8 to
get [21, Eq. 2.6.7]:

\[ P_{G,\max} := P_G \big|_{s_1 = \bar{s}_G} = \frac{|c_G|^2}{2} (1 - |s_G|^2)^{-1}. \]

The source mismatch factor is [21, Eq. 2.7.17]:

\[ \frac{P_G}{P_{G,\max}} = \frac{(1 - |s_G|^2)(1 - |s_1|^2)}{|1 - s_G s_1|^2}. \]

The maximum power available from the matching network is defined as the
average power delivered from the network to a conjugately matched load [21,
Eq. 2.6.19]:

\[ P_{L,\max} := P_L \big|_{s_L = \bar{s}_2} = \frac{|b_2|^2}{2} (1 - |s_2|^2). \]

Less straightforward to derive is the load mismatch factor [21, Eq. 2.7.25]:

\[ \frac{P_L}{P_{L,\max}} = \frac{(1 - |s_L|^2)(1 - |s_2|^2)}{|1 - s_L s_2|^2}. \]

These powers lead to several types of power gains [21, page 213]:

• Transducer power gain

\[ G_T := \frac{P_L}{P_{G,\max}} = \frac{\text{power delivered to the load}}{\text{maximum power available from the generator}} \]

• Power gain or operating power gain

\[ G_P := \frac{P_L}{P_1} = \frac{\text{power delivered to the load}}{\text{power delivered to the network}} \]

• Available power gain

\[ G_A := \frac{P_{L,\max}}{P_{G,\max}} = \frac{\text{maximum power available from the network}}{\text{maximum power available from the generator}} \]
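Lemma 4.2. Assume the setup of Figure 14. If the 2-port is lossless,

\[ G_T = \frac{(1 - |s_G|^2)(1 - |s_1|^2)}{|1 - s_G s_1|^2}. \]

Proof.

\[ G_T = \frac{P_L}{P_{G,\max}} \overset{\text{Lemma 4.1}}{=} \frac{-P_2}{P_{G,\max}} \overset{\text{lossless}}{=} \frac{P_1}{P_{G,\max}} \overset{\text{Lemma 4.1}}{=} \frac{P_G}{P_{G,\max}}. \qquad \Box \]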



What’s nice about the proof is that it makes clear that the equality holds because 
the power flowing into the lossless 2-port is the power flowing out of the 2-port. 
The key to analyzing the transducer power gain is the power mismatch. 
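Lemma 4.2 can be checked numerically. The sketch below, under the assumption of a series inductor at $p = j$ with unit reference resistance as the lossless 2-port, solves for the waves, forms $G_T = P_L / P_{G,\max}$ directly from the power definitions, and compares the result with the mismatch formula of the lemma. The function names and the test values of $s_G$ and $s_L$ are illustrative.

```python
import numpy as np

def gain_from_waves(S, s_G, s_L, c_G=1.0):
    """Transducer power gain P_L / P_G,max computed from the wave solution."""
    S_X = np.diag([s_G, s_L]).astype(complex)
    a = np.linalg.solve(np.eye(2) - S_X @ S, np.array([c_G, 0.0], dtype=complex))
    b = S @ a
    # The load sees b2 as its incident wave and returns a2 = s_L * b2.
    P_L = 0.5 * (abs(b[1]) ** 2 - abs(a[1]) ** 2)
    P_G_max = 0.5 * abs(c_G) ** 2 / (1.0 - abs(s_G) ** 2)
    s_1 = b[0] / a[0]  # reflectance looking into Port 1
    return P_L / P_G_max, s_1

def mismatch_gain(s_G, s_1):
    """Right-hand side of Lemma 4.2."""
    return (1 - abs(s_G) ** 2) * (1 - abs(s_1) ** 2) / abs(1 - s_G * s_1) ** 2

z = 1j  # assumed lossless 2-port: series inductor, L = 1, p = j
S = np.array([[z, 2.0], [2.0, z]], dtype=complex) / (z + 2.0)
s_G, s_L = 0.2 + 0.1j, 0.3 - 0.4j
G_T, s_1 = gain_from_waves(S, s_G, s_L)
print(np.isclose(G_T, mismatch_gain(s_G, s_1)))
```

Replacing `S` by a strictly passive matrix (for instance a scaled copy) breaks the equality, because some of the power entering Port 1 is then dissipated in the 2-port.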

4.8. Power mismatch. Previously we established that the power mismatch 
is the key to the matching problem. In fact, this is a concept that brings to- 
gether ideas from pure mathematics and applied electrical engineering, as seen 
in the engineer’s Smith Chart — a disk-shaped analysis tool marked with coordi- 
nate curves which look compellingly familiar to the mathematician. A standard 
engineering reference observes the connection [51]: 

The transformation through a lossless junction [2-port] . . . leaves invariant 
the hyperbolic distance . . . The hyperbolic distance to the origin of the 
[Smith] chart is the mismatch, that is, the standing-wave ratio expressed 
in decibels: It may be evaluated by means of the proper graduation on 
the radial arm of the Smith chart. For two arbitrary points $W_1$, $W_2$, the
hyperbolic distance between them may be interpreted as the mismatch that
results from the load $W_2$ seen through a lossless network that matches $W_1$
to the input waveguide.

Hyperbolic metrics have been under mathematical development for the last 200
years, while Phil Smith introduced his chart in the late 1930s with a somewhat
different motivation. It is fascinating to see how hyperbolic analysis transcribes
to electrical engineering. Mathematically, we start with the pseudohyperbolic
metric (see footnote 3) on $\mathbb{D}$ defined as follows (see [58, page 58]):

\[ \rho(s_1, s_2) := \left| \frac{s_1 - s_2}{1 - \bar{s}_1 s_2} \right| \qquad (s_1, s_2 \in \mathbb{D}). \]

The Möbius group of symmetries of $\mathbb{D}$ consists of all maps $g : \mathbb{D} \to \mathbb{D}$ [20,
Theorem 1.3]:

\[ g(s) = e^{j\theta}\, \frac{s - a}{1 - \bar{a} s}, \]

where $a \in \mathbb{D}$ and $\theta \in \mathbb{R}$. That $\rho$ is invariant under the Möbius maps $g$ is
fundamental (see [20] and [58, page 58]):

\[ \rho(g(s_1), g(s_2)) = \rho(s_1, s_2). \]

The hyperbolic metric (see footnote 4) on $\mathbb{D}$ is [58, page 59]:

\[ \beta(s_1, s_2) := \frac{1}{2} \log \frac{1 + \rho(s_1, s_2)}{1 - \rho(s_1, s_2)}. \]

Because $\rho$ is Möbius-invariant, it follows that $\beta$ is also Möbius-invariant:

\[ \beta(g(s_1), g(s_2)) = \beta(s_1, s_2). \tag{4-9} \]
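The invariance is easy to confirm numerically. The following minimal sketch implements the pseudohyperbolic metric, the hyperbolic metric, and a Möbius map of the disk, then compares distances before and after the map. The sample points and the parameters `a` and `theta` are arbitrary illustrative choices.

```python
import cmath
import math

def rho(s1, s2):
    """Pseudohyperbolic metric on the open unit disk."""
    s1, s2 = complex(s1), complex(s2)
    return abs((s1 - s2) / (1 - s1.conjugate() * s2))

def beta(s1, s2):
    """Hyperbolic metric on the open unit disk."""
    r = rho(s1, s2)
    return 0.5 * math.log((1 + r) / (1 - r))

def mobius(s, a, theta):
    """Disk automorphism g(s) = exp(j*theta) * (s - a) / (1 - conj(a) * s)."""
    a = complex(a)
    return cmath.exp(1j * theta) * (s - a) / (1 - a.conjugate() * s)

s1, s2 = 0.3 + 0.2j, -0.5 + 0.1j
a, theta = 0.4 - 0.3j, 1.1
g1, g2 = mobius(s1, a, theta), mobius(s2, a, theta)
print(abs(rho(g1, g2) - rho(s1, s2)) < 1e-12)
print(abs(beta(g1, g2) - beta(s1, s2)) < 1e-12)
```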



One can visualize the matching problem in terms of the action of this group
of symmetries. At fixed frequency, a given load reflectance $s_L$ corresponds to a
point in $\mathbb{D}$. Attaching a matching network to the load modifies this reflectance
by applying to it the Möbius transformation associated with the chain scattering
matrix of the matching network. By varying the choice of the matching network,
we vary the Möbius map applied to $s_L$ and sweep the modified reflectance around
the disk to a desirable position.

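The series inductor of Figure 10 provides an excellent example of this action
of a circuit as a Möbius map acting on the reflectances parameterized as points
of the unit disk. The series inductor has the chain scattering matrix [25, Table
6.2]:

\[ \Theta(p) = \begin{bmatrix} 1 - Lp/2 & Lp/2 \\ -Lp/2 & 1 + Lp/2 \end{bmatrix} \]

that acts on $s \in \mathbb{D}$ at $p = j\omega$ as

\[ \mathcal{G}(\Theta; s) = \frac{\theta_{11} s + \theta_{12}}{\theta_{21} s + \theta_{22}} = -\frac{\bar{a}}{a}\, \frac{s - a}{1 - \bar{a} s}; \qquad a = (1 + j2/(\omega L))^{-1}. \]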
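The equivalence between the linear-fractional action of the inductor's chain scattering matrix and a disk automorphism can be verified numerically. This sketch assumes $p = j\omega$ and the parameter $a = (1 + j2/(\omega L))^{-1}$, and writes the automorphism as a unimodular factor $-\bar{a}/a$ times $(s - a)/(1 - \bar{a} s)$. With a matched load, $s = 0$, the map returns $\bar{a} = j\omega L/(2 + j\omega L)$, the reflectance of the impedance $1 + j\omega L$.

```python
def inductor_action(s, L, omega):
    """Linear-fractional action of the series-inductor chain matrix at p = j*omega."""
    p = 1j * omega
    t11, t12 = 1 - L * p / 2, L * p / 2
    t21, t22 = -L * p / 2, 1 + L * p / 2
    return (t11 * s + t12) / (t21 * s + t22)

def mobius_form(s, L, omega):
    """Same map written as a unimodular factor times a disk automorphism."""
    a = 1 / (1 + 2j / (omega * L))
    return -(a.conjugate() / a) * (s - a) / (1 - a.conjugate() * s)

L, omega = 2.0, 1.0
samples = [0.0, 0.5, -0.3 + 0.6j, 0.1 - 0.8j]
errors = [abs(inductor_action(s, L, omega) - mobius_form(s, L, omega)) for s in samples]
print(max(errors) < 1e-12)
```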



3 Also known as the Poincaré hyperbolic distance function; see [50].

4 Also known as the Bergman metric or the Poincaré metric.

Figure 15 shows the Möbius action of this lossless 2-port on the disk. Frequency
is fixed at $p = j$. The upper left panel shows the unit disk partitioned into
radial segments. Each of the other panels shows the action of an inductor on
the points of this disk. Increasing the inductance warps the radial pattern to
the boundary. The radial segments are geodesics of $\rho$ and $\beta$. Because the
Möbius maps preserve both metrics, the resulting circles are also geodesics. More
generally, the geodesics of $\rho$ and $\beta$ are either the radial lines or the circles that
meet the boundary of the unit disk at right angles.




[Figure 15: panels on the unit disk, horizontal axis $\Re$ from -1 to 1; legible panel titles: L = 2, L = 3.]

Figure 15. Möbius action of the series inductor on the unit disk for increasing
inductance values (frequency fixed at $p = j$).



Several electrical engineering figures of merit for the matching problem are
naturally understood in terms of the geometry of the hyperbolic disk. We are
concerned primarily with three: (1) the power mismatch, (2) the VSWR, (3) the
transducer power gain. The power mismatch between two passive reflectances
$s_1$, $s_2$ is [29]:

\[ \Delta P(s_1, s_2) := \left| \frac{\bar{s}_1 - s_2}{1 - s_1 s_2} \right| = \rho(\bar{s}_1, s_2), \tag{4-10} \]

or the pseudohyperbolic distance between $\bar{s}_1$ and $s_2$ measured along their
geodesic. Thus, the geodesics of $\rho$ attach a geometric meaning to the power
mismatch and illustrate the quote at the beginning of this section.

The voltage standing wave ratio (VSWR) is a sensitive measure of impedance
mismatch. Intuitively, when power is pushed into a mismatched load, part of the
power is reflected back, as measured by the reflectance $s \in \mathbb{D}$. Superposition of the
incident and reflected wave sets up a voltage standing wave pattern. The VSWR
is the ratio of the maximum to minimum voltage in this pattern [6, Equation
3.51]:

\[ \mathrm{VSWR}(s) = 20 \log_{10}\!\left( \frac{1 + |s|}{1 - |s|} \right) \ [\mathrm{dB}]. \]

Referring to Figure 15, the VSWR is a scaled hyperbolic distance from the origin
to $s$ measured along its radial line. Thus, the geodesics of $\beta$ attach a geometric
meaning to the VSWR.
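The scaling can be made explicit: since $\beta(0, s) = \tfrac{1}{2} \ln\frac{1 + |s|}{1 - |s|}$, the VSWR in decibels equals $(40/\ln 10)\,\beta(0, s)$. The short sketch below checks this relation, taking the logarithm in the hyperbolic metric to be the natural logarithm (an assumption about the convention).

```python
import math

def vswr_db(s):
    """VSWR in decibels for a reflectance s in the open unit disk."""
    r = abs(s)
    return 20 * math.log10((1 + r) / (1 - r))

def beta_from_origin(s):
    """Hyperbolic distance from the origin to s (natural logarithm assumed)."""
    r = abs(s)
    return 0.5 * math.log((1 + r) / (1 - r))

scale = 40 / math.log(10)  # dB per unit of hyperbolic distance
s = 0.3 + 0.4j             # |s| = 0.5
print(abs(vswr_db(s) - scale * beta_from_origin(s)) < 1e-12)
```

For $|s| = 0.5$ the voltage ratio is 3, which corresponds to roughly 9.54 dB.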

The transducer power gain $G_T$ links to the power mismatch $\Delta P$ by the
classical identity of the hyperbolic metric [58, page 58]:

\[ 1 - \rho(s_1, s_2)^2 = \frac{(1 - |s_1|^2)(1 - |s_2|^2)}{|1 - \bar{s}_1 s_2|^2} \qquad (s_1, s_2 \in \mathbb{D}), \tag{4-11} \]

and Lemma 4.2, provided the matching 2-port is lossless.

Lemma 4.3. If the 2-port is lossless in Figure 14, $G_T = 1 - \Delta P(s_G, s_1)^2$.

That is, maximizing $G_T$ is equivalent to minimizing the power mismatch. As the
next result shows, we can use either Port 1 or Port 2 (proof in Appendix B).

Lemma 4.4. Assume the 2-port is lossless in Figure 6: $S \in U^+(2)$. Assume
$s_G$ and $s_L$ are strictly passive: $s_G, s_L \in BH^\infty(\mathbb{C}_+)$. Then $s_1 = \mathcal{F}_1(S, s_L)$ and
$s_2 = \mathcal{F}_2(S, s_G)$ (defined in Equations 2-2 and 2-3, respectively) are well-defined
and strictly passive with the LFT (Linear Fractional Transform) law

\[ \Delta P(s_G, \mathcal{F}_1(S, s_L)) = \Delta P(\mathcal{F}_2(S, s_G), s_L) \]

and the TPG (Transducer Power Gain) law

\[ G_T(s_G, S, s_L) = 1 - \Delta P(s_G, \mathcal{F}_1(S, s_L))^2 = 1 - \Delta P(\mathcal{F}_2(S, s_G), s_L)^2 \]

holding on $j\mathbb{R}$.
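A numerical spot check of the LFT law at a single frequency may help fix the notation. This sketch assumes the usual linear-fractional forms $\mathcal{F}_1(S, s_L) = s_{11} + s_{12} s_{21} s_L / (1 - s_{22} s_L)$ and $\mathcal{F}_2(S, s_G) = s_{22} + s_{12} s_{21} s_G / (1 - s_{11} s_G)$ for Equations 2-2 and 2-3, which lie outside this section, and uses a series inductor with unit reference resistance as the lossless 2-port.

```python
import numpy as np

def power_mismatch(s1, s2):
    """Delta P(s1, s2) = |conj(s1) - s2| / |1 - s1 * s2|."""
    return abs((np.conj(s1) - s2) / (1 - s1 * s2))

def F1(S, s_L):
    """Reflectance at Port 1 with Port 2 terminated in s_L (assumed form of Eq. 2-2)."""
    return S[0, 0] + S[0, 1] * S[1, 0] * s_L / (1 - S[1, 1] * s_L)

def F2(S, s_G):
    """Reflectance at Port 2 with Port 1 terminated in s_G (assumed form of Eq. 2-3)."""
    return S[1, 1] + S[0, 1] * S[1, 0] * s_G / (1 - S[0, 0] * s_G)

z = 1.5j  # series inductor, L = 1.5, p = j, reference resistance 1
S = np.array([[z, 2.0], [2.0, z]], dtype=complex) / (z + 2.0)
s_G, s_L = 0.3 + 0.2j, -0.1 + 0.5j

port1 = power_mismatch(s_G, F1(S, s_L))
port2 = power_mismatch(F2(S, s_G), s_L)
gain = 1 - port1 ** 2  # TPG law
print(np.isclose(port1, port2), gain)
```

The same mismatch is obtained whether it is evaluated at the generator side or at the load side, which is what allows the optimization to be posed at either port.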

The LFT law is not true if $S$ is strictly passive. For $S^H S \le I_2$, define the gains
at Port 1 and 2 as follows:

\[ G_1(s_G, S, s_L) := 1 - \Delta P(s_G, \mathcal{F}_1(S, s_L))^2, \qquad G_2(s_G, S, s_L) := 1 - \Delta P(\mathcal{F}_2(S, s_G), s_L)^2. \]

Lemma 4.4 gives that $G_T = G_1 = G_2$, provided $S$ is lossless. If $S$ is only passive,
we can only say $G_T \le G_1, G_2$. To see this, Equation 4-11 identifies $G_1$ and $G_2$
as mismatch factors:

\[ G_1(s_G, S, s_L) = 1 - \Delta P(s_G, s_1)^2 = \frac{P_G}{P_{G,\max}}, \qquad G_2(s_G, S, s_L) = 1 - \Delta P(s_2, s_L)^2 = \frac{P_L}{P_{L,\max}}. \]

If we believe that a passive 2-port forces the available gain $G_A \le 1$ and power
gain $G_P \le 1$ of Section 4.7, the inequalities $G_T \le G_1, G_2$ are explained as

\[ G_T = \frac{P_L}{P_{G,\max}} = \frac{P_{L,\max}}{P_{G,\max}}\, \frac{P_L}{P_{L,\max}} = G_A G_2 \le G_2, \]

\[ G_T = \frac{P_L}{P_{G,\max}} = \frac{P_1}{P_{G,\max}}\, \frac{P_L}{P_1} = G_1 G_P \le G_1, \]

where the second line uses $P_1 = P_G$ from Lemma 4.1.



4.9. Sublevel sets of the power mismatch. We have just seen that
impedance matching reduces to minimization of the power mismatch. We can
obtain some geometrical intuition for this behavior by examining Figure 16,
which shows the isocontours of the function $s_2 \mapsto \Delta P(s_2, s_L)$ for a fixed
reflectance $s_L$ in the unit disk (at a fixed frequency). The key observation is
that for each fixed frequency, the sublevel sets $\{s_2 \in \mathbb{D} : \Delta P(s_2, s_L) \le \rho\}$
comprise a family of concentric disks with hyperbolic center $\bar{s}_L$. Of course, we must
actually consider power mismatch over a range of frequencies. To this end, the
next lemma characterizes the corresponding sublevel sets in $L^\infty(j\mathbb{R})$.



Lemma 4.5 ($\Delta P$ Disks). Let $s_L \in BL^\infty(j\mathbb{R})$. Let $0 \le \rho \le 1$. Define the center
function

\[ k := \bar{s}_L\, \frac{1 - \rho^2}{1 - \rho^2 |s_L|^2} \in BL^\infty(j\mathbb{R}), \tag{4-12} \]

the radius function

\[ r := \rho\, \frac{1 - |s_L|^2}{1 - \rho^2 |s_L|^2} \in \overline{B}L^\infty(j\mathbb{R}), \tag{4-13} \]

and the disk

\[ D(k, r) := \{\phi \in L^\infty(j\mathbb{R}) : |\phi(j\omega) - k(j\omega)| \le r(j\omega)\}. \]

Then,

D-1: $D(k, r)$ is a closed, convex subset of $L^\infty(j\mathbb{R})$.

D-2: $D(k, r) = \{\phi \in \overline{B}L^\infty(j\mathbb{R}) : \rho \ge \|\Delta P(\phi, s_L)\|_\infty\}$.

D-3: $D(k, r)$ is a weak-* compact, convex subset of $L^\infty(j\mathbb{R})$.

Figure 16. Sublevel sets of $\Delta P(s_2, s_L)$ in the unit disk.

PROOF. Under the assumption that $\|s_L\|_\infty < 1$, it is straightforward to verify
that the center and radius functions are in the open and closed unit balls of
$L^\infty(j\mathbb{R})$, respectively.

D-1: Convexity and closure follow from pointwise convexity and closure.

D-2: Basic algebra computes $D(k, r) = \{\phi \in L^\infty(j\mathbb{R}) : \rho \ge \|\Delta P(\phi, s_L)\|_\infty\}$.
The "free" result is that $\|D(k, r)\|_\infty \le 1$. To see this, let $s := \|s_L\|_\infty$. The norm
of any element in $D(k, r)$ is bounded by

\[ \|k\|_\infty + \|r\|_\infty \le \frac{s(1 - \rho^2)}{1 - \rho^2 s^2} + \frac{\rho(1 - s^2)}{1 - \rho^2 s^2} =: u(s, \rho). \]

For $s \in [0, 1)$ fixed, we obtain

\[ \frac{\partial u}{\partial \rho} = \frac{1 - s^2}{(\rho s + 1)^2}. \]

Thus, $u(s, \cdot)$ attains its maximum on the boundary of $[0, 1]$: $u(s, 1) = 1$. Thus,
$\|D(k, r)\|_\infty \le 1$.

D-3: D-1 and Lemma 3.2. □
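At a single frequency the content of Lemma 4.5 is that the sublevel set of the power mismatch is a Euclidean disk with center $k = \bar{s}_L (1 - \rho^2)/(1 - \rho^2 |s_L|^2)$ and radius $r = \rho (1 - |s_L|^2)/(1 - \rho^2 |s_L|^2)$. The following sketch, with an arbitrarily chosen load reflectance and level, samples the boundary circle and confirms that the mismatch equals the level there. It is a pointwise illustration only and does not address the function-space statement.

```python
import numpy as np

def power_mismatch(s1, s2):
    """Delta P(s1, s2) = |conj(s1) - s2| / |1 - s1 * s2| (vectorized in s1)."""
    return np.abs((np.conj(s1) - s2) / (1 - s1 * s2))

def mismatch_disk(s_L, level):
    """Euclidean center and radius of {s2 : Delta P(s2, s_L) <= level}."""
    denom = 1 - level ** 2 * abs(s_L) ** 2
    center = np.conj(s_L) * (1 - level ** 2) / denom
    radius = level * (1 - abs(s_L) ** 2) / denom
    return center, radius

s_L, level = 0.4 + 0.3j, 0.5
k, r = mismatch_disk(s_L, level)
angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
boundary = k + r * np.exp(1j * angles)       # points on the circle |phi - k| = r
on_circle = power_mismatch(boundary, s_L)     # should all equal the level
inside = power_mismatch(k + 0.5 * r * np.exp(1j * angles), s_L)
print(np.allclose(on_circle, level), bool(np.all(inside < level)))
```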

4.10. Continuity of the power mismatch. Consider the mapping
$\Delta\rho : \overline{B}L^\infty(j\mathbb{R}) \to \mathbb{R}_+$,

\[ \Delta\rho(s_2) := \|\Delta P(s_2, s_L)\|_\infty, \]

for fixed $s_L \in BL^\infty(j\mathbb{R})$. The main problem of this paper concerns the
minimization of this functional over feasible classes (ultimately, the orbits of the
reflectance under classes of matching circuits). This problem is determined by
the structure of the sublevel sets of $\Delta\rho$. What we have just seen is that the
sublevel sets are disks in function space, a very nice structure indeed. As the
"level" of $\Delta\rho$ is decreased, these sets neck down; the question of existence of a
minimizer in a feasible class comes down to the intersection of the feasible class
with these sublevel sets.

Definition 4.1. [48, pages 38-39], [57, page 150] Let $\gamma$ be a real or
extended-real function on a topological space $X$.

• $\gamma$ is lower semicontinuous provided $\{x \in X : \gamma(x) \le \alpha\}$ is closed for every real $\alpha$.

• $\gamma$ is lower semicompact provided $\{x \in X : \gamma(x) \le \alpha\}$ is compact for every
real $\alpha$.

These properties produce minimizers by the Weierstrass Theorem.

Theorem 4.1 (Weierstrass). [57, page 152] Let $K$ be a nonempty subset of
a topological space $X$. Let $\gamma$ be a real or extended-real function defined on $K$.
If either condition holds:

• $\gamma$ is lower semicontinuous on the compact set $K$, or

• $\gamma$ is lower semicompact,

then $\inf\{\gamma(x) : x \in K\}$ admits minimizers.

Lemma 4.5 demonstrates that $\Delta\rho$ is both weak-* lower semicontinuous and
weak-* lower semicompact. The minimum of $\Delta\rho$ in $\overline{B}L^\infty(j\mathbb{R})$ is $0 = \Delta\rho(\bar{s}_L)$,
which corresponds to a perfect match over all frequencies. However, the matching functions
at our disposal are not arbitrary, and this trivial solution is typically not
obtainable with real matching circuits. The constraints on allowable matching
functions lead us to consider minimizing $\Delta\rho$ restricted to $BH^\infty(\mathbb{C}_+)$, $BA_1(\mathbb{C}_+)$,
and associated orbits. Finally, straightforward sequence arguments show that
$\Delta\rho$ is also continuous as a function on $\overline{B}L^\infty(j\mathbb{R})$ in the norm topology.

Lemma 4.6. If $s_L \in BL^\infty(j\mathbb{R})$, then $\Delta\rho : \overline{B}L^\infty(j\mathbb{R}) \to \mathbb{R}_+$ is continuous.

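PROOF. Define $\Delta P_1 : \overline{B}L^\infty(j\mathbb{R}) \to L^\infty(j\mathbb{R})$ as $\Delta P_1(s) := (s - \bar{s}_L)(1 - s s_L)^{-1}$.
If we show that $\Delta P_1$ is continuous, then composition with $\|\cdot\|_\infty$ shows continuity
of $\Delta\rho$. The first task is to show $\Delta P_1$ is well-defined. For each $s \in \overline{B}L^\infty(j\mathbb{R})$,
$\Delta P_1(s)$ is measurable and

\[ \|\Delta P_1(s)\|_\infty \le \frac{2}{1 - \|s\|_\infty \|s_L\|_\infty} \le \frac{2}{1 - \|s_L\|_\infty}. \]

Thus, $\Delta P_1(s) \in L^\infty(j\mathbb{R})$, so $\Delta P_1$ is well-defined. For continuity, let $\{s_n\} \subset \overline{B}L^\infty(j\mathbb{R})$
and $s_n \to s$. Then

\[ \begin{aligned}
\Delta P_1(s_n) - \Delta P_1(s) &= \frac{s_n - \bar{s}_L}{1 - s_n s_L} - \frac{s - \bar{s}_L}{1 - s s_L} \\
&= \frac{1}{(1 - s_n s_L)(1 - s s_L)} \{(s_n - \bar{s}_L)(1 - s s_L) - (s - \bar{s}_L)(1 - s_n s_L)\} \\
&= \frac{1}{(1 - s_n s_L)(1 - s s_L)} \{s_n - s + s_L (s s_n - s_n s) + (s - s_n)|s_L|^2\}.
\end{aligned} \]

In terms of the norm,

\[ \|\Delta P_1(s_n) - \Delta P_1(s)\|_\infty \le (1 - \|s_L\|_\infty)^{-2} \{\|s_n - s\|_\infty + \|s_L\|_\infty \|s s_n - s_n s\|_\infty + \|s - s_n\|_\infty \|s_L\|_\infty^2\}, \]

so that the difference converges to zero. With $\Delta P_1$ a continuous mapping, the
continuity of the norm $\|\cdot\|_\infty : L^\infty(j\mathbb{R}) \to \mathbb{R}_+$ makes the mapping $\Delta\rho(s) :=
\|\Delta P_1(s)\|_\infty$ also continuous. □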



5. $H^\infty$ Matching Techniques

Recalling the matching problem synopsis of Section 2, our goal is to maximize
the transducer power gain $G_T$ over a specified class $\mathcal{U}$ of scattering matrices. By
Lemma 4.3, we can equivalently minimize the power mismatch:

\[ \begin{aligned}
\sup\{\|G_T(s_G, S, s_L)\|_{-\infty} : S \in \mathcal{U}\} &= 1 - \inf\{\|\Delta P(\mathcal{F}_2(S, s_G), s_L)\|_\infty^2 : S \in \mathcal{U}\} \\
&= 1 - \inf\{\|\Delta P(s_2, s_L)\|_\infty^2 : s_2 \in \mathcal{F}_2(\mathcal{U}, s_G)\} \\
&\le 1 - \inf\{\|\Delta P(s_2, s_L)\|_\infty^2 : s_2 \in BH^\infty(\mathbb{C}_+)\}.
\end{aligned} \]

The next step in our program is to develop tools for computing the upper bound
at the end of this chain of expressions, based on what we know of $s_L$. Ultimately,
we will try to make this a tight bound given the right properties of the admissible
matching circuits parameterized by $\mathcal{U}$. The key computation is a hyperbolic
version of Nehari's Theorem that computes the minimum power mismatch from
the Hankel matrix determined by $s_L$.

We start towards this in Section 5.1 by reviewing the concept of Hankel
operators and their relation to best approximation from $H^\infty$ as expressed by the
linear Nehari theory. Section 5.2 extends this to a nonlinear framework that
includes the desired hyperbolic Nehari bound on the power mismatch as a special
case.

Having computed a bound on our ability to match a given load, we consider
how closely one can approach this in a practical implementation with real
circuits. The key matching circuits we consider in practice are the lumped, lossless
2-ports with scattering matrices in $U^+(2, \infty)$. Later on, Section 7 demonstrates
that the orbit of $s_G = 0$ under $U^+(2, \infty)$ is dense in the real disk algebra,
$\mathrm{Re}\,BA_1(\mathbb{C}_+)$ (Darlington's Theorem), so that the smallest mismatch approachable
with lumped circuits is

\[ \inf\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in \mathcal{F}_2(U^+(2, \infty), 0)\} = \inf\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in \mathrm{Re}\,BA_1(\mathbb{C}_+)\}. \]

If we can relate the latter infimum to the minimization over the larger space
$H^\infty(\mathbb{C}_+)$, then minimizing the power mismatch over the lumped circuits can be
related to the computable hyperbolic Nehari bound. This seems plausible from
experience with the classical linear Nehari Theory, where $\phi$ real and continuous
implies that the distance from the real subset of the disk algebra is the same as the
distance to $H^\infty$:

\[ \|\phi - H^\infty(\mathbb{C}_+)\|_\infty = \|\phi - \mathrm{Re}\,A_1(\mathbb{C}_+)\|_\infty. \]

Section 5.3 obtains similar results for the nonlinear hyperbolic Nehari bound
using metric properties of the power mismatch $\Delta P$.

Thus, the results of this section will provide the desired result: the Nehari
bound for the matching problem is both computable and tight in the sense that
a sequence of lumped, lossless 2-ports can be found that approach the Nehari
bound.

5.1. Nehari's theorem. The Toeplitz and Hankel operators are most
conveniently defined on $L^2(\mathbb{T})$ using the Fourier basis. Let $\phi \in L^2(\mathbb{T})$ have the
Fourier expansion

\[ \phi(z) = \sum_{n=-\infty}^{\infty} \hat{\phi}(n) z^n \qquad (z = e^{j\theta}). \]

Let $P$ denote the orthogonal projection of $L^2(\mathbb{T})$ onto $H^2(\mathbb{D})$:

\[ P\phi(z) = \sum_{n=0}^{\infty} \hat{\phi}(n) z^n. \]

The Toeplitz operator with symbol $\phi \in L^\infty(\mathbb{T})$ is the mapping $\mathcal{T}_\phi : H^2(\mathbb{D}) \to H^2(\mathbb{D})$,

\[ \mathcal{T}_\phi h := P(\phi h). \]

The Hankel operator with symbol $\phi \in L^\infty(\mathbb{T})$ is the mapping $\mathcal{H}_\phi : H^2(\mathbb{D}) \to H^2(\mathbb{D})$,

\[ \mathcal{H}_\phi h := U(I - P)(\phi h), \]

where $U : H^2(\mathbb{D})^\perp \to H^2(\mathbb{D})$ is the unitary "flipping" operator:

\[ Uh(z) := z^{-1} h(z^{-1}). \]

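These operators admit matrix representations with respect to the Fourier basis
[56, page 173]:

\[ \mathcal{T}_\phi = \begin{bmatrix}
\hat{\phi}(0) & \hat{\phi}(1) & \hat{\phi}(2) & \cdots \\
\hat{\phi}(-1) & \hat{\phi}(0) & \hat{\phi}(1) & \cdots \\
\hat{\phi}(-2) & \hat{\phi}(-1) & \hat{\phi}(0) & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix} \]

and [56, page 191]

\[ \mathcal{H}_\phi = \begin{bmatrix}
\hat{\phi}(-1) & \hat{\phi}(-2) & \hat{\phi}(-3) & \cdots \\
\hat{\phi}(-2) & \hat{\phi}(-3) & \hat{\phi}(-4) & \cdots \\
\hat{\phi}(-3) & \hat{\phi}(-4) & \hat{\phi}(-5) & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}. \]

The operator norm is

\[ \|\mathcal{H}_\phi\| := \sup\{\|\mathcal{H}_\phi h\|_2 : h \in H^2(\mathbb{D}),\ \|h\|_2 \le 1\}. \]

The essential norm is

\[ \|\mathcal{H}_\phi\|_e := \inf\{\|\mathcal{H}_\phi - K\| : K \text{ is a compact operator}\}. \]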

The following version of Nehari's Theorem emphasizes existence and uniqueness
of best approximations.

Theorem 5.1 (Nehari [56; 45]). If $\phi \in L^\infty(\mathbb{T})$, then $\phi$ admits best
approximations from $H^\infty(\mathbb{D})$ as follows:

N-1: $\|\phi - H^\infty(\mathbb{D})\|_\infty = \|\mathcal{H}_\phi\|$.

N-2: $\|\phi - \{H^\infty(\mathbb{D}) + C(\mathbb{T})\}\|_\infty = \|\mathcal{H}_\phi\|_e$.

N-3: If $\|\mathcal{H}_\phi\|_e < \|\mathcal{H}_\phi\|$ then best approximations are unique.
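Part N-1 suggests a simple numerical recipe: assemble a finite section of the Hankel matrix from the negatively indexed Fourier coefficients and take its largest singular value. The sketch below is a minimal illustration of that idea. The truncation size and the example symbols are assumptions, and for a symbol with infinitely many negative coefficients the finite section gives only a lower estimate of the operator norm. The helper names are hypothetical.

```python
import numpy as np

def hankel_matrix(neg_coeffs, size):
    """Finite Hankel section with entry (i, j) = phi_hat(-(i + j + 1)).

    neg_coeffs[k] holds phi_hat(-(k + 1)); missing coefficients are taken as zero.
    """
    c = np.zeros(2 * size, dtype=complex)
    m = min(len(neg_coeffs), c.size)
    c[:m] = neg_coeffs[:m]
    i, j = np.indices((size, size))
    return c[i + j]

def hankel_norm(neg_coeffs, size):
    """Largest singular value of the finite Hankel section."""
    return np.linalg.norm(hankel_matrix(neg_coeffs, size), 2)

# phi(z) = 1/z: the only negative coefficient is phi_hat(-1) = 1.
dist_inverse_z = hankel_norm([1.0], 4)
# phi(z) = 2/z + 1/z**2: the nonzero block is [[2, 1], [1, 0]].
dist_two_terms = hankel_norm([2.0, 1.0], 4)
print(dist_inverse_z, dist_two_terms)
```

Only the negatively indexed coefficients enter, which reflects the fact that the analytic part of the symbol can be absorbed into the approximant from $H^\infty(\mathbb{D})$.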

Thus, Nehari's Theorem computes the distance from $\phi$ to $H^\infty(\mathbb{D})$ using the
Hankel matrix. However, solving the matching problem with lumped circuits
forces us to minimize from the disk algebra $A(\mathbb{D})$. Because the disk algebra is a
proper subset of $H^\infty(\mathbb{D})$, there always holds the inequality:

\[ \|\phi - A(\mathbb{D})\|_\infty \ge \|\phi - H^\infty(\mathbb{D})\|_\infty = \|\mathcal{H}_\phi\|. \]

Fortunately for our application, equality holds when $\phi$ is continuous.

Theorem 5.2 (Adapted from [39, pages 193-195], [33; 34]). If $\phi \in 1 + C_0(j\mathbb{R})$,

\[ \|\phi - A_1(\mathbb{C}_+)\|_\infty = \|\phi - H^\infty(\mathbb{C}_+)\|_\infty \]

and there is exactly one $h \in H^\infty(\mathbb{C}_+)$ such that

\[ \|\phi - A_1(\mathbb{C}_+)\|_\infty = |\phi(j\omega) - h(j\omega)| \quad \text{a.e.} \]

Thus, continuity forces unicity and characterizes the minimum by the circularity
of the error $\phi - h$. To get existence in the disk algebra requires more than
continuity. Let $\phi : \mathbb{R} \to \mathbb{C}$ be periodic with period $2\pi$. The modulus of continuity
of $\phi$ is the function [18, page 71]:

\[ \omega(\phi; t) := \sup\{|\phi(t_1) - \phi(t_2)| : t_1, t_2 \in \mathbb{R},\ |t_1 - t_2| \le t\}. \]

Let $\Lambda_\alpha$ denote those functions that satisfy a Lipschitz condition of order $\alpha \in
(0, 1]$:

\[ |\phi(t_1) - \phi(t_2)| \le A |t_1 - t_2|^\alpha. \]

Let $C^{n+\alpha}$ denote those functions with $\phi^{(n)} \in \Lambda_\alpha$ [5]. Let $C_\omega$ denote those
functions that are Dini-continuous:

\[ \int_0^\epsilon \omega(\phi; t)\, t^{-1}\, dt < \infty, \]

for some $\epsilon > 0$. A sufficient condition for a function $\phi(t)$ to be Dini-continuous
is that $|d\phi/dt|$ be bounded [19, section IV.2]. Carleson & Jacobs have an amazing
paper that addresses best approximation from the disk algebra [5]:

Theorem 5.3 (Carleson & Jacobs [5]). If $\phi \in L^\infty(\mathbb{T})$, then there always
exists a best approximation $h \in H^\infty(\mathbb{D})$:

\[ \|\phi - h\|_\infty = \|\phi - H^\infty(\mathbb{D})\|_\infty. \]

If $\phi \in C(\mathbb{T})$, then the best approximation is unique. Moreover,

(a): If $\phi \in C_\omega$ then $h \in C_\omega$.

(b): If $\phi^{(n)} \in C_\omega$ then $h \in C_\omega$.

(c): If $0 < \alpha < 1$ and $\phi \in \Lambda_\alpha$ then $h \in \Lambda_\alpha$.

(d): If $0 < \alpha < 1$, $n \in \mathbb{N}$, and $\phi \in C^{n+\alpha}$ then $h \in C^{n+\alpha}$.

As noted by Carleson & Jacobs [5]: "the function-theoretic proofs ... are all
of a local character, and so all the results can easily be carried over to any
region which has in each case a sufficiently regular boundary." Provided we
can guarantee smoothness across $\pm j\infty$, Theorem 5.3 carries over to the right
half-plane.

Corollary 5.1. If $\phi \in 1 + C_0(j\mathbb{R})$, then the best approximation

\[ \|\phi - h\|_\infty = \|\phi - H^\infty(\mathbb{C}_+)\|_\infty \]

exists and is unique. Moreover, if $\phi \circ \mathbf{c}^{-1} \in C_\omega$, then $h \circ \mathbf{c}^{-1} \in C_\omega$ so that

\[ \|\phi - h\|_\infty = \|\phi - H^\infty(\mathbb{C}_+)\|_\infty = \|\phi - A_1(\mathbb{C}_+)\|_\infty. \]

Thus, the smoothness of the target function $\phi$ is invariant under the best
approximation operator of $H^\infty$.

5.2. Nonlinear Nehari and simple matching bounds. Helton [28; 31; 29;
32] is extending Nehari's Theorem into a general Theory of Analytic
Optimization. Let $\Gamma : j\mathbb{R} \times \mathbb{C} \to \mathbb{R}_+$ be continuous. Define $\gamma : L^\infty(j\mathbb{R}) \to \mathbb{R}_+ \cup \{\infty\}$
by

\[ \gamma(h) := \operatorname{ess\,sup}\{\Gamma(j\omega, h(j\omega)) : \omega \in \mathbb{R}\} \]

and consider the minimization of $\gamma$ on $K \subseteq L^\infty(j\mathbb{R})$:

\[ \min\{\gamma(\phi) : \phi \in K\}. \]

Helton observed that many interesting problems in electrical engineering and
control theory have the form of this minimization problem and furthermore in
many cases the objective functions have sublevel sets that are disks [32]:

\[ [\gamma \le \alpha] := \{\phi \in BL^\infty(j\mathbb{R}) : \gamma(\phi) \le \alpha\} = D(c_\alpha, r_\alpha). \]

This is certainly the case for the matching problem. For a given load $s_L \in
BL^\infty(j\mathbb{R})$, we want to minimize the worst case mismatch

\[ \gamma(s_2) = \Delta\rho(s_2) := \operatorname{ess\,sup}\{\Delta P(s_2(j\omega), s_L(j\omega)) : \omega \in \mathbb{R}\} \]

over all $s_2 \in BH^\infty(\mathbb{C}_+)$. In this special case, Lemma 4.5 shows explicitly that
the sublevel sets of $\Delta\rho$ are disks. These sublevel sets govern the optimization
problem. For a start, the sublevel sets determine the existence of minimizers.

Lemma 5.1. Let $\gamma : \overline{B}L^\infty(j\mathbb{R}) \to \mathbb{R}$. Assume $\gamma$ has sublevel sets that are disks
contained in $\overline{B}L^\infty(j\mathbb{R})$:

\[ [\gamma \le \alpha] = D(c_\alpha, r_\alpha) \subseteq \overline{B}L^\infty(j\mathbb{R}). \]

Then $\gamma$ has a minimizer $h_{\min} \in \overline{B}H^\infty(\mathbb{C}_+)$.

PROOF. Lemma 3.2 gives that $\gamma$ is lower semicontinuous in the weak-* topology.
Because $\overline{B}H^\infty(\mathbb{C}_+)$ is weak-* compact, the Weierstrass Theorem of Section 4.10
forces the existence of $H^\infty$ minimizers. □

In particular, an $H^\infty$ minimizer of power mismatch does exist. This is only the
beginning; we'll see that the disk structure of the sublevel sets also couples with
Nehari's Theorem to characterize such minimizers using Helton's fundamental
link between disks and operators. Ultimately, this line of inquiry permits us to
calculate the matching performance for real problems.

Theorem 5.4 (Helton [29, Theorem 4.2]). Let $C, P, R \in L^\infty(\mathbb{T}, \mathbb{C}^{N \times N})$.
Assume $P$ and $R$ are uniformly strictly positive. Define the disk

\[ D(C, R, P) := \{\Phi \in L^\infty(\mathbb{T}, \mathbb{C}^{N \times N}) : (\Phi - C) P^2 (\Phi - C)^H \le R^2\} \]

and $\tilde{R}(j\omega) := R(-j\omega)$. Then

0 ^ D(C,R,P)nH°°{ D,C JVxJV ) XcTpUKb < T A 2, 

For the impedance matching problem, $\gamma$ is the power mismatch $\Delta P$ whose
sublevel sets are contained in $\overline{B}L^\infty(j\mathbb{R})$:

\[ D(c_\alpha, r_\alpha) \cap \overline{B}H^\infty(\mathbb{C}_+) = D(c_\alpha, r_\alpha) \cap H^\infty(\mathbb{C}_+). \]

Consequently, in our problem the unit ball constraint may be ignored and we may
apply Theorem 5.4 specialized to the disk theory under this stronger assumption.

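Corollary 5.2. Let $\gamma : \overline{B}L^\infty(j\mathbb{R}) \to \mathbb{R}$. Assume $\gamma$ has sublevel sets that are
disks:

\[ [\gamma \le \alpha] = D(c_\alpha, r_\alpha) \subseteq \overline{B}L^\infty(j\mathbb{R}). \]

Let $C_\alpha := c_\alpha \circ \mathbf{c}^{-1}$ and $R_\alpha := r_\alpha \circ \mathbf{c}^{-1}$, where $\mathbf{c}$ is the Cayley transform of
Lemma 3.3. Assume $R_\alpha$ is strictly uniformly positive with spectral factor $Q_\alpha \in
H^\infty(\mathbb{D})$: $R_\alpha = |Q_\alpha|$. Then the following are equivalent:

(a): $D(c_\alpha, r_\alpha) \cap H^\infty(\mathbb{C}_+) \ne \emptyset$

(b): $\mathcal{H}_{C_\alpha}^H \mathcal{H}_{C_\alpha} \le \mathcal{T}_{R_\alpha^2}$

(c): $\|Q_\alpha^{-1} C_\alpha - H^\infty(\mathbb{D})\|_\infty \le 1$.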

PROOF. By Theorem 5.4, all that is needed is to prove (a) $\Leftrightarrow$ (c). If (a) is
true, there exists an $H \in \overline{B}H^\infty(\mathbb{D})$ such that $|H - C_\alpha| \le R_\alpha = |Q_\alpha|$ a.e.
Because $R_\alpha$ is strictly uniformly positive on $\mathbb{T}$, we may divide by $|Q_\alpha|$ to get
$|Q_\alpha^{-1} H - Q_\alpha^{-1} C_\alpha| \le 1$ a.e. Because $Q_\alpha$ is outer, $Q_\alpha^{-1} H \in H^\infty(\mathbb{D})$ so that (c) must
be true. Conversely, suppose (c) is true. Because $Q_\alpha$ is outer, $Q_\alpha^{-1} C_\alpha \in L^\infty(\mathbb{T})$.
The Cayley transform of Nehari's Theorem forces the existence of a $G \in H^\infty(\mathbb{D})$
such that $\|G - Q_\alpha^{-1} C_\alpha\|_\infty \le 1$. Because $Q_\alpha$ is outer, $H = Q_\alpha G \in H^\infty(\mathbb{D})$ and
$|H - C_\alpha| \le R_\alpha$ a.e. Then $H \in D(C_\alpha, R_\alpha) \cap H^\infty(\mathbb{D})$. Because $D(C_\alpha, R_\alpha)$
is assumed to be contained in the unit ball of $L^\infty(\mathbb{T})$, the Cayley transform
forces (a) to hold. □

Part (b) amounts to an eigenvalue test that admits a nice graphical display of the
minimizing $\alpha$. Let $\lambda_{\inf}(\alpha)$ denote the smallest "eigenvalue" of $\mathcal{T}_{R_\alpha^2} - \mathcal{H}_{C_\alpha}^H \mathcal{H}_{C_\alpha}$.
A plot of $\alpha$ versus $\lambda_{\inf}(\alpha)$ reveals that $\lambda_{\inf}(\alpha)$ is a decreasing function of $\alpha$ that
crosses zero at a minimum. The next result verifies this assertion regarding the
minimum.

Corollary 5.3. Let $\gamma : \overline{B}L^\infty(j\mathbb{R}) \to \mathbb{R}$. Assume $\gamma$ has sublevel sets that are
disks contained in $\overline{B}L^\infty(j\mathbb{R})$:

\[ [\gamma \le \alpha] = D(c_\alpha, r_\alpha) \subseteq \overline{B}L^\infty(j\mathbb{R}). \]

Then $\gamma$ has a minimizer $h_{\min} \in \overline{B}H^\infty(\mathbb{C}_+)$:

\[ \gamma_{BH^\infty} := \min\{\gamma(h) : h \in \overline{B}H^\infty(\mathbb{C}_+)\}. \]

Let $c_{\min}$ and $r_{\min}$ denote the $L^\infty(j\mathbb{R})$ center and radius functions of the sublevel
disk at the minimum level: $[\gamma \le \gamma_{BH^\infty}]$. Let $C_\alpha := c_\alpha \circ \mathbf{c}^{-1}$ and $R_\alpha := r_\alpha \circ \mathbf{c}^{-1}$,
where $\mathbf{c}$ is the Cayley transform of Lemma 3.3. Assume $R_{\min}$ is strictly uniformly
positive with spectral factor $Q_{\min}$. Then the following are equivalent:

Min-1: $D(c_{\min}, r_{\min}) \cap \overline{B}H^\infty \ne \emptyset$

Min-2: $0 = \lambda_{\inf}(\gamma_{BH^\infty})$

Min-3: $\|Q_{\min}^{-1} C_{\min} - H^\infty(\mathbb{D})\|_\infty = 1$.

Moreover, if $Q_{\min}^{-1} C_{\min} \in C(\mathbb{T})$ the minimizer $h_{\min}$ is unique.

PROOF. Min-1 $\Rightarrow$ Min-3: If the inequality were strict, $|C_{\min} - H| < R_{\min}$ a.e. for
some $H \in H^\infty(\mathbb{D})$. Then $h = H \circ \mathbf{c}$ belongs to $H^\infty(\mathbb{C}_+)$ and drops $\gamma$ below
its minimum: $\gamma(h) < \alpha_{\min}$. This contradiction forces equality at the minimum.
Min-3 $\Rightarrow$ Min-1: Corollary 5.2.

Min-1 $\Rightarrow$ Min-2: Theorem 5.4 forces $\mathcal{H}_{C_{\min}}^H \mathcal{H}_{C_{\min}} \le \mathcal{T}_{R_{\min}^2}$ or $0 \le \lambda_{\inf}(\gamma_{BH^\infty})$.
This operator inequality is equivalent to $1 \ge \|\mathcal{H}_{Q_{\min}^{-1} C_{\min}}\|$ [29, page 42]. By
Nehari's Theorem, $1 \ge \|\mathcal{H}_{Q_{\min}^{-1} C_{\min}}\| = \|Q_{\min}^{-1} C_{\min} - H^\infty(\mathbb{D})\|_\infty = 1$, where the
equivalence of Min-1 and Min-3 gives the last equality. Thus, the inequality must
be an equality. Min-2 $\Rightarrow$ Min-1: $0 = \lambda_{\inf}(\gamma_{BH^\infty})$ forces $1 = \|\mathcal{H}_{Q_{\min}^{-1} C_{\min}}\|$. By
Nehari's Theorem, $1 = \|Q_{\min}^{-1} C_{\min} - H^\infty(\mathbb{D})\|_\infty$. The Cayley transform of
Nehari's Theorem gives an $H \in H^\infty(\mathbb{D})$ such that $1 = \|Q_{\min}^{-1} C_{\min} - H\|_\infty$. Multiply
by the spectral factor to get $R_{\min} \ge |C_{\min} - Q_{\min} H|$ a.e., or that $D(C_{\min}, R_{\min}) \cap
H^\infty(\mathbb{D}) \ne \emptyset$. Use the assumption that the sublevel sets are contained in the
closed unit ball to get Min-1. For unicity, Min-3 forces $H_{\min} = h_{\min} \circ \mathbf{c}^{-1}$ to be
a minimizer of $1 = \|Q_{\min}^{-1} C_{\min} - H^\infty(\mathbb{D})\|_\infty = \|Q_{\min}^{-1} C_{\min} - H_{\min}\|_\infty$. Because
$C_{\min}$ is continuous, the Cayley transform of Corollary 5.1 forces unicity. □

Lumped matching circuits have continuous scattering matrices. This requires us
to constrain our minimization of power mismatch yet further to the disk algebra.
For minimization of a general $\gamma$ over the disk algebra, we always have

\[ \gamma_{BH^\infty} \le \gamma_{BA_1} := \inf\{\gamma(h) : h \in \overline{B}A_1(\mathbb{C}_+)\}. \]

Under smoothness and continuity conditions, equality between the disk algebra
and $H^\infty$ can be established.

Corollary 5.4. In addition to the assumptions of Corollary 5.3, assume
$Q_{\min}^{-1} C_{\min}$ is Dini-continuous. Then

\[ \gamma_{BH^\infty} = \gamma_{BA_1} = \min\{\gamma(h) : h \in \overline{B}A_1(\mathbb{C}_+)\}. \]

PROOF. By Corollary 5.3, there is a unique minimizer $H_{\min} \in H^\infty(\mathbb{D})$:

\[ 1 = \|Q_{\min}^{-1} C_{\min} - H^\infty(\mathbb{D})\|_\infty = \|Q_{\min}^{-1} C_{\min} - H_{\min}\|_\infty. \]

By Corollary 5.1, Dini-continuity forces $H_{\min}$ to be Dini-continuous or $h_{\min} =
H_{\min} \circ \mathbf{c} \in A_1(\mathbb{C}_+)$. Thus, the inclusion of the $H^\infty$ minimizer in the disk algebra
forces $\gamma_{BH^\infty} = \gamma_{BA_1}$. □

This is a useful general result, but for our matching problem the requirement
of Dini-continuity can in fact be relaxed. An easier approach, specialized to the
case where $\gamma$ is the power mismatch, gives equality between the minimum over the
disk algebra and that over $H^\infty$ using only continuity (proof in Appendix D).

Theorem 5.5. Assume $s_L \in \overline{B}A_1(\mathbb{C}_+)$. Then

\[ \min\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in \overline{B}H^\infty(\mathbb{C}_+)\} = \inf\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in \overline{B}A_1(\mathbb{C}_+)\}. \]

5.3. The real constraint. Examination of the circuits in Section 4 shows the
scattering matrices are real: $\overline{S(p)} = S(\bar{p})$. In fact, the scattering matrices that
are used in the matching problem must satisfy this real constraint. Those $H^\infty$
functions satisfying this real constraint form a proper subset $\mathrm{Re}\,H^\infty(\mathbb{C}_+)$, which
generally forces the inequality:

\[ \inf\{\|\phi - h\|_\infty : h \in \mathrm{Re}\,H^\infty(\mathbb{C}_+)\} \ge \|\phi - H^\infty(\mathbb{C}_+)\|_\infty. \]

However, equality is obtained provided $\phi$ is also real. That the best
approximation operator preserves the real constraint is an excellent illustration of the
general principle that the best approximation operator preserves symmetries.

Lemma 5.2. Let $(X, d)$ be a metric space. Assume $A : X \to X$ is a contractive
map: $d(A(x), A(y)) \le d(x, y)$. Let $V \subseteq X$ be nonempty. Define $\mathrm{dist}(x, V) :=
\inf\{d(x, v) : v \in V\}$. Assume

A-1: $V$ is $A$-invariant: $A(V) \subseteq V$.

A-2: $x \in X$ is also $A$-invariant: $A(x) = x$.

Then equality holds: $\mathrm{dist}(x, A(V)) = \mathrm{dist}(x, V)$.

PROOF. Let $\{v_n\}$ be a minimizing sequence: $d(x, v_n) \to \mathrm{dist}(x, V)$. Because
$x$ is $A$-invariant, $d(x, A(v_n)) = d(A(x), A(v_n)) \le d(x, v_n) \to \mathrm{dist}(x, V)$. Thus,
$\mathrm{dist}(x, A(V)) \le \mathrm{dist}(x, V)$. Because $A(V) \subseteq V$ gives the reverse inequality,
equality holds. □

Lemma 5.2 makes explicit the structure to handle the real constraint in the
matching problem.

Corollary 5.5. If $s_L \in B\,\mathrm{Re}\,L^\infty(j\mathbb{R})$, there holds

\[ \inf\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in \overline{B}A_1(\mathbb{C}_+)\} = \inf\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in \mathrm{Re}\,\overline{B}A_1(\mathbb{C}_+)\}. \]

PROOF. Apply Lemma 5.2, identifying $BL^\infty(j\mathbb{R})$ as the metric space, $\tilde{\phi}(j\omega) :=
\overline{\phi(-j\omega)}$ as the contraction, $\mathrm{Re}\,BA_1(\mathbb{C}_+)$ as the ~-invariant subset, and $s_L$ as the
~-invariant target function. Recall that the power mismatch $\Delta P(s_2, s_L)$ is the
pseudohyperbolic metric $\rho(\bar{s}_2, s_L)$ (Section 4.8). Because $\rho$ is a metric, it
follows that $\|\rho\|_\infty$ is also a metric that is ~-invariant: $\|\rho(\tilde{s}_2, \tilde{s}_L)\|_\infty = \|\rho(s_2, s_L)\|_\infty$.
The technical complication is that $\Delta P(s_2, s_L)$ is well-defined only when one
of its arguments is restricted to the open unit ball $BL^\infty(j\mathbb{R})$. With $s_L \in
B\,\mathrm{Re}\,L^\infty(j\mathbb{R})$, Lemma 4.6 asserts that $s_2 \mapsto \|\Delta P(s_2, s_L)\|_\infty$ is a continuous
mapping on $\overline{B}L^\infty(j\mathbb{R})$. Thus, we use continuity to drop the $\overline{B}$ constraint, apply
Lemma 5.2 to the open ball with the real contraction, and apply continuity
again to close the open ball:

\[ \begin{aligned}
\inf\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in \mathrm{Re}\,\overline{B}A_1(\mathbb{C}_+)\}
&\overset{\text{Lemma 4.6}}{=} \inf\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in \mathrm{Re}\,BA_1(\mathbb{C}_+)\} \\
&\overset{\text{Eq. 4-10}}{=} \inf\{\|\rho(\bar{s}_2, s_L)\|_\infty : s_2 \in \mathrm{Re}\,BA_1(\mathbb{C}_+)\} \\
&\overset{\text{Lemma 5.2}}{=} \inf\{\|\rho(\bar{s}_2, s_L)\|_\infty : s_2 \in BA_1(\mathbb{C}_+)\} \\
&\overset{\text{Eq. 4-10}}{=} \inf\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in BA_1(\mathbb{C}_+)\} \\
&\overset{\text{Lemma 4.6}}{=} \inf\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in \overline{B}A_1(\mathbb{C}_+)\}. \qquad \Box
\end{aligned} \]

Not surprisingly, Helton has also uncovered another notion of "real-invariance"
for general nonlinear minimization [32].

6. Classes of Lossless 2-Ports 

The matching problems are optimization problems over classes of $U^+(2)$:

\[ U^+(2,d) \subset U^+(2,\infty) \subset U^+(2) \subset \mathrm{Re}\,\overline{B}H^\infty(\mathbb{C}_+, \mathbb{C}^{2\times 2}). \]

On the left, $U^+(2,d)$ corresponds to the lumped, lossless 2-ports. Optimization
over this set represents an electrical engineering solution. On the right, the $H^\infty$
solution provided in the last section is computable from the measured data but
may not correspond to any lossless scattering matrix. The gap between the
$H^\infty$ solution and the various electrical engineering solutions may be closed by
continuity conditions.

The first result gives the correspondence between the lumped N-ports and
their scattering matrices.

The Circuit-Scattering Correspondence [52, Theorems 3.1, 3.2]. Any
N-port composed of a finite number of lumped elements (positive resistors,
capacitors, inductors, transformers, gyrators) admits a real, rational, lossless
scattering matrix $S \in U^+(N)$. Conversely, to any real, rational scattering matrix
$S \in U^+(N)$ there corresponds an N-port composed of a finite number of lumped
elements.

This equivalence permits us to delineate the following class of lossless 2-ports by
their scattering matrices:

\[ U^+(2,d) := \{S \in U^+(2) : \deg_{SM}[S(p)] \le d\}, \]

where $\deg_{SM}[S(p)]$ denotes the Smith-McMillan degree (defined in Theorem 6.2).
The second result establishes compactness (Appendix C contains the proof).

Theorem 6.1. Let $d \ge 0$. $U^+(N,d)$ is a compact subset of $\mathcal{A}_1(\mathbb{C}_+, \mathbb{C}^{N\times N})$.



It is straightforward but tedious to demonstrate that the gain function $S \mapsto
\|G_T(s_G, S, s_L)\|_{-\infty}$ is a continuous function on $U^+(2,d)$. Thus, the matching
problem on $U^+(2,d)$ has a solution. The third result on $U^+(2,d)$ is the Belevitch
parameterization.

Belevitch's Theorem [53]. $S \in U^+(2,d)$ if and only if

\[ S(p) = \begin{bmatrix} s_{11}(p) & s_{12}(p) \\ s_{21}(p) & s_{22}(p) \end{bmatrix}
 = \frac{1}{g(p)} \begin{bmatrix} h(p) & f(p) \\ \pm f_*(p) & \mp h_*(p) \end{bmatrix}, \]

where $f_*(p) := f(-p)$ and

B-1: $f(p)$, $g(p)$, and $h(p)$ are real polynomials,

B-2: $g(p)$ is strict Hurwitz (see footnote 5) of degree not exceeding $d$,

B-3: $g_*(p)g(p) = f_*(p)f(p) + h_*(p)h(p)$ for all $p \in \mathbb{C}$.
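The Belevitch form can be sanity-checked numerically. The sketch below is illustrative only: it picks the hypothetical triple $f = 1$, $g = p + 1$, $h = p$ (which satisfies B-1 to B-3), takes the upper signs, and measures how far $S(j\omega)^H S(j\omega)$ is from the identity. The helper names are ours.

```python
def polyval(coeffs, p):
    """Evaluate a polynomial with coefficients in descending powers (Horner)."""
    acc = 0j
    for c in coeffs:
        acc = acc * p + c
    return acc

def star(coeffs):
    """Coefficients of q_*(p) := q(-p)."""
    n = len(coeffs) - 1
    return [c * (-1) ** (n - k) for k, c in enumerate(coeffs)]

def belevitch(f, g, h, p):
    """S(p) built from a Belevitch triple, taking the upper signs."""
    gp = polyval(g, p)
    return [[polyval(h, p) / gp, polyval(f, p) / gp],
            [polyval(star(f), p) / gp, -polyval(star(h), p) / gp]]

def unitarity_defect(S):
    """Largest entry of |S^H S - I| for a 2x2 matrix S."""
    worst = 0.0
    for i in range(2):
        for k in range(2):
            entry = sum(S[m][i].conjugate() * S[m][k] for m in range(2))
            worst = max(worst, abs(entry - (1.0 if i == k else 0.0)))
    return worst

f, g, h = [1.0], [1.0, 1.0], [1.0, 0.0]   # f = 1, g = p + 1, h = p
omegas = [0.0, 0.1, 1.0, 10.0, 100.0]
defect = max(unitarity_defect(belevitch(f, g, h, 1j * w)) for w in omegas)
print(defect)
```

A defect at the level of rounding error indicates that the sampled matrices are unitary on the imaginary axis, as losslessness requires.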



Belevitch's Theorem lets us characterize several classes of 2-ports, such as the
low-pass and high-pass ladders. The low-pass ladders (Figure 11) admit the
scattering matrix characterization [3, page 121]:

\[ s_{21}(p) = \frac{1}{g(p)}. \]

These scattering matrices ($f(p) = 1$) form a closed and therefore compact subset
of $U^+(2,d)$. Consequently, the matching problem admits a solution over the class
of low-pass ladders. Figure 17 shows a high-pass ladder. A high-pass ladder
admits the scattering matrix characterization [3, page 122]:

\[ s_{21}(p) = \frac{p^{\partial g}}{g(p)}, \]

where $\partial g$ denotes the degree of the polynomial $g(p)$. The high-pass ladders form
a closed and therefore compact subset of $U^+(2,d)$. Consequently, the matching
problem admits a solution over the class of high-pass ladders.

Figure 17. A high-pass ladder.

Footnote 5: The zeros of $g(p)$ lie in the open left half-plane.

The fourth result on $U^+(2,d)$ is the state-space parameterization illustrated
in Figure 18. The N-port has a scattering matrix $S \in U^+(N,d)$, where $d =
\deg_{SM}[S(p)]$ counts the number of inductors and capacitors. The figure shows
the result of pulling all $d$ reactive elements into the augmented load $S_L(p)$. What is
left is an $(N+d)$-port that has a constant scattering matrix $S_a$, called the
augmented scattering matrix. Then $S_a$ models the $(N+d)$-port as a collection
of wires, transformers, and gyrators. Consequently, $S_a$ is a real, unitary, and
constant matrix. Thus, $S(p)$ is the image of the augmented load viewed through
the augmented scattering matrix. Theorem 6.2 gives the precise statement of
this state-space representation.

Figure 18. State-space representation of a lumped, lossless N-port containing
$d$ reactive elements.



Theorem 6.2 (State-Space [52, pages 90-93]). Every lumped, lossless, causal,
time-invariant N-port admits a scattering matrix $S(p)$ and conversely. If $S(p)$
has degree $d$, $S(p)$ admits the following state-space representation:

\[ S(p) = \mathcal{F}(S_a, S_L; p) := S_{a,11} + S_{a,12}\, S_L(p)\, (I_d - S_{a,22} S_L(p))^{-1} S_{a,21}, \]

where the augmented load is

\[ S_L(p) = \frac{p-1}{p+1} \begin{bmatrix} I_{N_L} & 0 \\ 0 & -I_{N_C} \end{bmatrix} \]

and $N_L + N_C = d$ counts the number of inductors and capacitors. The augmented
scattering matrix

\[ S_a = \begin{bmatrix} S_{a,11} & S_{a,12} \\ S_{a,21} & S_{a,22} \end{bmatrix}, \]

partitioned into block rows and block columns of sizes $N$ and $d$,
is a constant, real, orthogonal matrix.
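To make the state-space formula concrete, here is a minimal sketch for the smallest case $N = 1$, $d = 1$ (a 1-port containing a single inductor). The orthogonal matrix is parameterized by a hypothetical angle `theta`; the function names are ours and the example is illustrative rather than the authors' implementation.

```python
import math

def augmented_load(p):
    """Scalar augmented load (p - 1)/(p + 1) for one inductor (N_L = 1, N_C = 0)."""
    return (p - 1) / (p + 1)

def state_space_reflectance(Sa, p):
    """S(p) = Sa11 + Sa12 * sL * (1 - Sa22 * sL)^(-1) * Sa21 for N = 1, d = 1."""
    (a11, a12), (a21, a22) = Sa
    sL = augmented_load(p)
    return a11 + a12 * sL * a21 / (1 - a22 * sL)

theta = 0.7  # hypothetical angle parameterizing a 2x2 real orthogonal matrix
c, s = math.cos(theta), math.sin(theta)
Sa = ((c, s), (s, -c))

omegas = [0.0, 0.3, 1.0, 5.0, 50.0]
moduli = [abs(state_space_reflectance(Sa, 1j * w)) for w in omegas]
print(moduli)
```

Every sampled modulus equals one up to rounding, which is the lossless (inner) property on the imaginary axis; sweeping `theta` sweeps the whole family in this small case.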



This representation reveals the structure of the lumped, lossless N-ports, offers
a numerically efficient parameterization of $U^+(N,d)$ in terms of the orthogonal
group, proves the Circuit-Scattering Correspondence, generalizes to lumped,
passive N-ports, and provides an approach to non-lumped or distributed N-ports.

A natural generalization drops the constraint on the number of reactive
elements in the 2-port and asks: What is the matching set that is obtained as
$\deg_{SM}[S(p)] \to \infty$? Define

\[ U^+(2,\infty) := \overline{\bigcup_{d \ge 0} U^+(2,d)}. \]

The physical meaning of $U^+(2,\infty)$ is that it contains the scattering matrices
of all lumped, lossless 2-ports. It is worthwhile to ask: Has the closure
picked up additional circuits? Mathematically, a lossless matching N-port has a
scattering matrix $S(p)$ that is a real inner function. Inner functions exhibit a
fascinating behavior at the boundary. For example, inner functions can interpolate
a sequence of closed, connected subsets $K_m \subset \overline{\mathbb{D}}$ [12]: $\lim_{r \to 1} S(re^{j\theta_m}) = K_m$.
In contrast to this boundary behavior, if the lossless N-port is lumped, then $S$
is rational and so must be continuous. The converse is true and demonstrated in
Appendix A.

Corollary 6.1. Let $S \in H^\infty(\mathbb{C}_+, \mathbb{C}^{N\times N})$ be an inner function. The following
are equivalent:

(a): $S \in \mathcal{A}_1(\mathbb{C}_+, \mathbb{C}^{N\times N})$.

(b): $S$ is rational.

Corollary 6.1 answers our question above in the negative:

\[ U^+(2,\infty) = \bigcup_{d \ge 0} U^+(2,d). \]

Thus, continuity forces $S \in U^+(2,\infty)$ to be rational and the corresponding
lossless 2-port to be lumped. It is natural to ask: What lossless 2-ports are not
in $U^+(2,\infty)$?

Example 6.1 (Transmission Line). A uniform, lossless transmission line of
characteristic impedance $Z_c$ and commensurate length $l$ is called a unit element
(UE) with chain matrix [3, Equation 8.1]

\[ \begin{bmatrix} v_1 \\ i_1 \end{bmatrix} =
 \begin{bmatrix} \cosh(\tau p) & Z_c \sinh(\tau p) \\ Y_c \sinh(\tau p) & \cosh(\tau p) \end{bmatrix}
 \begin{bmatrix} v_2 \\ -i_2 \end{bmatrix}, \]

where $\tau$ is the commensurate one-way delay $\tau = l/c$ determined by the speed of
propagation $c$.

Figure 19. The unit element (UE) transmission line.



The scattering matrix of the transmission line normalized to $Z_c$ is

\[ S_{UE}(p) = \begin{bmatrix} 0 & e^{-\tau p} \\ e^{-\tau p} & 0 \end{bmatrix} \]

and gives rise to two observations: First, $S_{UE}(j\omega)$ oscillates out to $\pm\infty$, so
$S_{UE}(j\omega)$ cannot be continuous across $\pm\infty$. Thus, $U^+(2,\infty)$ cannot contain such
a transmission line. Second, a physical transmission line cannot behave like this
near $\pm\infty$. Many electrical engineering books mention only in passing that their
models are applicable only for a given frequency band. One rarely sees much
discussion that the models for the inductor and capacitor are essentially
low-frequency models. This holds true even for the standard model of wire. One
cannot shine a light in one end of a 100-foot length of copper wire and expect
much out of the other end. These model limitations notwithstanding, the
circuit-scattering correspondence will be developed using these standard models. The
transmission line on the disk is

\[ S_{UE} \circ c^{-1}(z) = \begin{bmatrix} 0 & \exp\left(\tau \frac{z+1}{z-1}\right) \\ \exp\left(\tau \frac{z+1}{z-1}\right) & 0 \end{bmatrix} \]

and is recognizable as the simplest singular inner function [35, pages 66-67],
analytic on $\mathbb{C} \setminus \{1\}$ [35, pages 68-69]. Figure 20 shows the essential singularity
of the real part of the (1,2) element of $S_{UE} \circ c^{-1}(z)$ as $z$ tends toward the
unit circle.
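The boundary behavior of the unit element can be explored numerically. The sketch below assumes the Cayley transform $c(p) = (p-1)/(p+1)$, so that $c^{-1}(z) = (1+z)/(1-z)$; that convention and the function name are our assumptions, chosen to be consistent with the singularity at $z = 1$.

```python
import cmath
import math

def s12_disk(z, tau=1.0):
    """(1,2) entry exp(-tau * (1 + z)/(1 - z)) of the unit element on the disk.

    Assumes the Cayley transform c(p) = (p - 1)/(p + 1), so c^{-1}(z) = (1 + z)/(1 - z).
    """
    return cmath.exp(-tau * (1 + z) / (1 - z))

# Approach the singular point z = 1 along the circle and just inside it.
for theta_deg in (20.0, 5.0, 1.0, 0.2):
    theta = math.radians(theta_deg)
    inside = s12_disk(0.999 * cmath.exp(1j * theta))
    on_circle = s12_disk(cmath.exp(1j * theta))
    print(theta_deg, abs(inside), abs(on_circle), on_circle.real)
```

On the circle the modulus stays at one while the real part oscillates faster and faster as the angle shrinks; just inside the disk the modulus drops below one.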



7. Orbits and Tight Bounds for Matching 

The following equalities convert a 2-port problem into a 1-port problem. Let
$\mathcal{U}$ be a subset of $U^+(2)$. Let

\[ \mathcal{F}_1(\mathcal{U}, s_L) := \{\mathcal{F}_1(S, s_L) : S \in \mathcal{U}\}, \qquad \mathcal{F}_2(\mathcal{U}, s_G) := \{\mathcal{F}_2(S, s_G) : S \in \mathcal{U}\} \]

denote the orbit of the load and the orbit of the generator, respectively. By
Lemma 4.4,

\begin{align*}
\sup\{\|G_T(s_G,S,s_L)\|_{-\infty} : S \in \mathcal{U}\}
 &= 1 - \inf\{\|\Delta P(s_G,S,s_L)\|_\infty^2 : S \in \mathcal{U}\} \\
 &= 1 - \inf\{\|\Delta P(s_G,s_1)\|_\infty^2 : s_1 \in \mathcal{F}_1(\mathcal{U}; s_L)\} \\
 &= 1 - \inf\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathcal{F}_2(\mathcal{U}; s_G)\},
\end{align*}

or maximizing the gain on $\mathcal{U}$ is equivalent to minimizing the power mismatch on
either orbit. Darlington's Theorem makes explicit a class of orbits.

Figure 20. Behavior of $\mathrm{Re}[S_{UE,12} \circ c^{-1}(z)]$ for $z = re^{j\theta}$ as $r \to 1$ (horizontal axis: $\theta$ in degrees).

Theorem 7.1 (Darlington [3]). The orbits of zero under the lumped, lossless
2-ports are equal,

\[ \mathcal{F}_2(U^+(2,\infty), 0) = \mathcal{F}_1(U^+(2,\infty), 0), \]

and strictly dense in $\mathrm{Re}\,\overline{B}\mathcal{A}_1(\mathbb{C}_+)$.

PROOF. Let $S \in U^+(2,\infty)$. Corollary 6.1 and Belevitch's Theorem give that

\[ S(p) = \frac{1}{g} \begin{bmatrix} h & f \\ \pm f_* & \mp h_* \end{bmatrix} \in \mathrm{Re}\,\mathcal{A}_1(\mathbb{C}_+, \mathbb{C}^{2\times 2}), \]

where $(f,g,h)$ is a Belevitch triple. With $s_L = 0$ and $s_G = 0$, both $s_1 =
\mathcal{F}_1(S, 0) = h/g$ and $s_2 = \mathcal{F}_2(S, 0) = \mp h_*/g$ belong to $\mathrm{Re}\,\overline{B}\mathcal{A}_1(\mathbb{C}_+)$. However,
Corollary 6.1 restricts $S$ to be rational, so the orbits cannot be all of $\mathrm{Re}\,\overline{B}\mathcal{A}_1(\mathbb{C}_+)$.
By relabeling $S$ with $1 \leftrightarrow 2$, we get equality between the orbits. To show density,
suppose $s \in \mathrm{Re}\,\overline{B}\mathcal{A}_1(\mathbb{C}_+)$. Because the rational functions in $\mathrm{Re}\,\overline{B}\mathcal{A}_1(\mathbb{C}_+)$ are a
dense subset (see footnote 6), we may approximate $s(p)$ by a real rational function:
$s \approx h/g \in \mathrm{Re}\,\overline{B}\mathcal{A}_1(\mathbb{C}_+)$, where $h(p)$ and $g(p)$ may be taken as real polynomials
with $g(p)$ strict Hurwitz and, for all $\omega \in \mathbb{R}$, $g(j\omega)g_*(j\omega) - h(j\omega)h_*(j\omega) \ge 0$.
By factoring $g(p)g_*(p) - h(p)h_*(p)$ or appealing to the Fejér-Riesz Theorem
[46, page 109], we can find a real polynomial $f(p)$ such that

\[ f(p)f_*(p) = g(p)g_*(p) - h(p)h_*(p). \]

The conditions of Belevitch's Theorem are met and

\[ S(p) = \frac{1}{g(p)} \begin{bmatrix} h(p) & f(p) \\ f_*(p) & -h_*(p) \end{bmatrix} \]

is a lossless scattering matrix that represents a lumped, lossless 2-port. That is,
$h(p)/g(p)$ dilates to a lossless scattering matrix $S(p)$ for which $s \approx s_{11}$.
Consequently, both orbits are dense in $\mathrm{Re}\,\overline{B}\mathcal{A}_1(\mathbb{C}_+)$. □
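As a small worked instance of the dilation step in the proof, take the illustrative choice $h(p) = 1/2$ and $g(p) = p + 1$ (these polynomials are our assumption, picked only because the factorization can be done by hand):

```latex
\begin{align*}
g(p)g_*(p) - h(p)h_*(p) &= (1+p)(1-p) - \tfrac14 = \tfrac34 - p^2
   = \left(\tfrac{\sqrt3}{2} + p\right)\left(\tfrac{\sqrt3}{2} - p\right), \\
f(p) &= p + \tfrac{\sqrt3}{2}, \qquad f_*(p) = \tfrac{\sqrt3}{2} - p, \\
S(p) &= \frac{1}{p+1}
   \begin{bmatrix} \tfrac12 & p + \tfrac{\sqrt3}{2} \\ \tfrac{\sqrt3}{2} - p & -\tfrac12 \end{bmatrix}, \\
|s_{11}(j\omega)|^2 + |s_{21}(j\omega)|^2 &= \frac{\tfrac14 + \tfrac34 + \omega^2}{1 + \omega^2} = 1 .
\end{align*}
```

Here $g$ is strict Hurwitz, $f$ is a real polynomial, and the first column has unit norm on the imaginary axis, so the reflectance $s_{11} = h/g$ has been dilated to a lossless $S(p)$.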

At this point we are in a position to obtain a tight bound on matching performance
in the special case of vanishing generator reflectance, $s_G = 0$. For any given load
$s_L \in BH^\infty(\mathbb{C}_+)$, Lemma 4.6 shows that $s_2 \mapsto \|\Delta P(s_2, s_L)\|_\infty$ is continuous.
This continuity, coupled with the density claims of Darlington's Theorem, gives:

\begin{align*}
\min\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathcal{F}_2(U^+(2,d); 0)\}
 &\ge \inf\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathcal{F}_2(U^+(2,\infty); 0)\} \\
 &= \inf\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathrm{Re}\,\overline{B}\mathcal{A}_1(\mathbb{C}_+)\} \\
 &\ge \inf\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \overline{B}H^\infty(\mathbb{C}_+)\}.
\end{align*}

The "max" and the "min" are used because $U^+(2,d)$ is compact (Theorem 6.1)
and $G_T$ is continuous. The last infimum is attained by a minimizer by the
Weierstrass Theorem using the weak-* compactness of $\overline{B}H^\infty(\mathbb{C}_+)$ (page 10) and the
weak-* lower semicontinuity of the power mismatch (Section 4.10). The minimum
can be computed using the Nonlinear Nehari Theorem (see the comments
following Corollary 5.2 and Corollary 5.3). Thus, the impedance matching problem
has a computable bound:

\begin{align*}
\max\{\|G_T(0,S,s_L)\|_{-\infty} : S \in U^+(2,d)\}
 &= 1 - \min\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathcal{F}_2(U^+(2,d); 0)\} \\
 &\le 1 - \inf\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathcal{F}_2(U^+(2,\infty); 0)\} \\
 &= 1 - \inf\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathrm{Re}\,\overline{B}\mathcal{A}_1(\mathbb{C}_+)\} && \text{(Darlington)} \\
 &\le 1 - \min\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \overline{B}H^\infty(\mathbb{C}_+)\} && \text{(Corollary 5.3; computable)}.
\end{align*}

Footnote 6: Density claims on unbounded regions can be tricky. However, Lemma 3.3 isometrically
maps $\mathcal{A}_1(\mathbb{C}_+) = \mathcal{A}(\mathbb{D}) \circ c$ and preserves the rational functions. Therefore, the dense rational
functions in $\mathcal{A}(\mathbb{D})$ map to a set of rational functions in $\mathcal{A}_1(\mathbb{C}_+)$ that must be dense.

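The real constraint can be relaxed for real loads $s_L$ by Corollary 5.5:

\begin{align*}
\max\{\|G_T(0,S,s_L)\|_{-\infty} : S \in U^+(2,d)\}
 &= 1 - \min\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathcal{F}_2(U^+(2,d); 0)\} \\
 &\le 1 - \inf\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathcal{F}_2(U^+(2,\infty); 0)\} \\
 &= 1 - \inf\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathrm{Re}\,\overline{B}\mathcal{A}_1(\mathbb{C}_+)\} && \text{(Darlington)} \\
 &= 1 - \inf\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \overline{B}\mathcal{A}_1(\mathbb{C}_+)\} && \text{(Corollary 5.5)} \\
 &\le 1 - \min\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \overline{B}H^\infty(\mathbb{C}_+)\} && \text{(Corollary 5.3; computable)}.
\end{align*}

Finally, the last inequality is actually equality if $s_L$ is sufficiently smooth, using
Theorem 5.5. Rolling it all up, we see that $s_L \in \mathrm{Re}\,B\mathcal{A}_1(\mathbb{C}_+)$ forces a lot of
equalities:

\begin{align*}
\max\{\|G_T(0,S,s_L)\|_{-\infty} : S \in U^+(2,d)\}
 &= 1 - \min\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathcal{F}_2(U^+(2,d); 0)\} \\
 &\le 1 - \inf\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathcal{F}_2(U^+(2,\infty); 0)\} \\
 &= 1 - \inf\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathrm{Re}\,\overline{B}\mathcal{A}_1(\mathbb{C}_+)\} && \text{(Darlington)} \\
 &= 1 - \inf\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \overline{B}\mathcal{A}_1(\mathbb{C}_+)\} && \text{(Corollary 5.5)} \\
 &= 1 - \min\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \overline{B}H^\infty(\mathbb{C}_+)\} && \text{(Theorem 5.5; computable by Corollary 5.3)}.
\end{align*}

Physically, this tight Nehari bound means that a lossless 2-port can be found
with smallest possible power mismatch and that there is a sequence of lumped,
lossless 2-ports that can get arbitrarily close to this bound. Furthermore, this
bound can be computed from measured data on the load.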

8. Matching an HF Antenna 

Recent measurements were acquired on the forward-mast integrated HF antenna
on the LPD 17, an amphibious transport dock. The problem is to match this
antenna over 9-30 MHz to a 50-ohm line impedance using the simplest matching
circuit possible. The goal is to find a simple matching circuit that gets the
smallest power mismatch or the smallest VSWR (Section 4.8). Thus, a practical
matching problem is complicated by not only minimizing the VSWR but also making
a tradeoff between VSWR and circuit complexity.
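The VSWR and gain figures quoted in the captions of Figures 22 and 23 are related by simple conversions. The sketch below assumes the standard relation VSWR $= (1+\rho)/(1-\rho)$ for a power mismatch $\rho$ and the gain relation $G_T = 1 - \rho^2$; the function names are ours.

```python
def vswr_to_mismatch(vswr):
    """Power mismatch (reflectance magnitude) corresponding to a VSWR."""
    return (vswr - 1.0) / (vswr + 1.0)

def mismatch_to_gain(rho):
    """Transducer power gain 1 - rho^2 for a given power mismatch rho."""
    return 1.0 - rho * rho

# VSWR values printed with Figures 22 and 23.
for vswr in (5.952, 3.469, 2.814):
    rho = vswr_to_mismatch(vswr)
    print(vswr, round(rho, 4), round(mismatch_to_gain(rho), 4))
```

Under these assumptions the unmatched VSWR of 5.952 corresponds to the gain 0.4926 reported in the figure captions, which is a useful consistency check on the conventions.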

We start with a transformer, consider low- and high-pass ladders, and then
show how the Nehari bound benchmarks these matching efforts. The transformer
has chain and chain scattering matrices parameterized by its turns ratio $n$ (see
[3, Eq. 2.4] and [25, Table 6.2]; see also Figure 4 and Equation 4-1):

\[ T_{\mathrm{transformer}} = \begin{bmatrix} n^{-1} & 0 \\ 0 & n \end{bmatrix}, \qquad
 \Theta_{\mathrm{transformer}} = \frac{1}{2n} \begin{bmatrix} 1 + n^2 & 1 - n^2 \\ 1 - n^2 & 1 + n^2 \end{bmatrix}. \]

Figure 21 displays the power mismatch as a function of the turns ratio $n$. The
optimal $n$ produced Figure 5 in the introduction. The antenna's load $s_L$ is
plotted as the solid curve in the unit disk. The solid disk corresponds to those
reflectances with VSWR less than 4. The dotted line plots the reflectance looking
into Port 1 of the optimal transformer with Port 2 terminated in the antenna:
$s_1 = \mathcal{F}_1(\Theta_{\mathrm{transformer}}, s_L)$. Lemma 4.4 demonstrates that matching at either port
is equivalent when the 2-port is lossless.

Figure 21. Power mismatch of an ideal transformer (data set lpd17fwd4_2; panel title: Matching by ideal transformer).
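The one-parameter search behind Figure 21 can be sketched as a grid search over the turns ratio. The code below assumes the input impedance $z_{in} = z_L/n^2$ implied by the chain matrix $\mathrm{diag}(n^{-1}, n)$, impedances normalized to the line, and $s_G = 0$; the load samples are hypothetical and are not the LPD 17 measurements.

```python
def transformer_reflectance(z_load, n):
    """Reflectance at Port 1 of an ideal transformer terminated in z_load.

    Uses z_in = z_load / n**2, which follows from the chain matrix diag(1/n, n);
    impedances are normalized to the line impedance.
    """
    z_in = z_load / (n * n)
    return (z_in - 1) / (z_in + 1)

def worst_mismatch(z_loads, n):
    """Largest |s_1| over the sampled band (generator reflectance s_G = 0)."""
    return max(abs(transformer_reflectance(z, n)) for z in z_loads)

def best_turns_ratio(z_loads, ratios):
    """Grid search for the turns ratio with the smallest worst-case mismatch."""
    return min(ratios, key=lambda n: worst_mismatch(z_loads, n))

# Hypothetical normalized load samples across a band (not the LPD 17 data).
z_loads = [4.0 + 1.5j, 4.0 + 0.5j, 4.0 - 0.5j, 4.0 - 1.5j]
ratios = [0.5 + 0.01 * k for k in range(351)]
n_opt = best_turns_ratio(z_loads, ratios)
print(n_opt, worst_mismatch(z_loads, n_opt))
```

Because the transformer has a single real parameter, a grid search is adequate here; richer classes such as $U^+(2,d)$ need the state-space parameterization instead.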
Figure 22. The antenna's reflectance $s_L$ (solid) and the reflectance $s_1$ after
matching with a low-pass ladder of order 4. VSWR: 5.952 → 3.469; $\|G_T\|_{-\infty}$ = 0.4926 → 0.6948.

Figure 22 matches the antenna with a low-pass ladder of order 4 (see Figure 11).
Comparison with the transformer shows little is to be gained with the
extra complexity. So it is very tempting to try longer ladders, or switch to
high-pass ladders, or just start throwing circuits at the antenna. The first step to gain
control over the matching process is to conduct a search over all lumped, lossless
2-ports of degree not exceeding $d$:

\[ d \mapsto \min\{\|\Delta P(\mathcal{F}_2(S, s_G), s_L)\|_\infty : S \in U^+(2,d)\}. \]

The state-space representation of Theorem 6.2 provides a numerically efficient
parameterization of these lossless 2-ports. Figure 23 reports on matching from
$U^+(2,4)$. What is interesting is that $s_2$ is starting to take a circular shape. This
circular shape is no accident. Mathematically, Nehari's Theorem implies that
the error is constant at the optimum $s_2$:

\[ \Delta P(s_2(j\omega), s_L(j\omega)) = \rho_{\min}. \]

The electrical engineers know the practical manifestation of Nehari's Theorem.
For example, a broadband matching technique is described as follows [55]: The
load impedance $z_L$ is plotted in the Smith chart. The engineer is to terminate
this load with a cascade of lossless two-ports. By repeatedly applying
"shunt-stub/series-line cascades, a skilled designer using simulation software can see
[the terminated impedance $z_T$] form into a fairly tight circle around $z = 1$." The
appearance of a circle is a real-world demonstration that Nehari's Theorem is
heuristically understood by microwave engineers.

Figure 23. The antenna's reflectance $s_L$ (solid) and the reflectance $s_1$ after
matching over $U^+(2,4)$ (panel title: $s_L$ = lpd17fwd4_2 matched by $S(p) \in U^+(2,4)$).
VSWR: 5.952 → 2.814; $\|G_T\|_{-\infty}$ = 0.4926 → 0.7738.

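The final step for bounding the matching process is to estimate the Nehari
bound. Combine the eigenvalue test of Corollary 5.2 with the characterization
of the power mismatch disks in Lemma 4.5: There is an $s_2 \in \overline{B}H^\infty(\mathbb{C}_+)$ with

\[ \|\Delta P(s_2, s_L)\|_\infty \le \rho \]

if and only if $\mathcal{T}_{R_\rho^2} \ge \mathcal{H}_{C_\rho}^* \mathcal{H}_{C_\rho}$, where the center and radius functions are

\[ C_\rho = k_\rho \circ c, \quad k_\rho = \bar{s}_L\, \frac{1 - \rho^2}{1 - \rho^2 |s_L|^2}, \qquad
 R_\rho = r_\rho \circ c, \quad r_\rho = \rho\, \frac{1 - |s_L|^2}{1 - \rho^2 |s_L|^2}. \]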
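The center and radius formulas can be checked pointwise at a single frequency. The sketch below assumes the power mismatch has the form $|\bar{s}_2 - s_L| / |1 - s_2 s_L|$, which is consistent with the conjugated center $\bar{s}_L$; the sample values and function names are hypothetical.

```python
import cmath
import math

def power_mismatch(s2, sL):
    """Assumed form of the power mismatch: |conj(s2) - sL| / |1 - s2 * sL|."""
    return abs(s2.conjugate() - sL) / abs(1 - s2 * sL)

def mismatch_disk(sL, rho):
    """Center k_rho and radius r_rho of the disk {s2 : power_mismatch(s2, sL) <= rho}."""
    denom = 1 - rho ** 2 * abs(sL) ** 2
    center = sL.conjugate() * (1 - rho ** 2) / denom
    radius = rho * (1 - abs(sL) ** 2) / denom
    return center, radius

sL, rho = 0.6 + 0.3j, 0.5   # hypothetical load reflectance and mismatch level
center, radius = mismatch_disk(sL, rho)
boundary = [center + radius * cmath.exp(2j * math.pi * k / 12) for k in range(12)]
values = [power_mismatch(s2, sL) for s2 in boundary]
print(center, radius, min(values), max(values))
```

Every point on the Euclidean circle returns the same mismatch $\rho$, which is the geometric fact that turns the matching problem into a disk inequality at each frequency.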
Figure 24. Estimate of $\lambda_{\inf}(\rho)$ versus $\rho$ in terms of the VSWR (panel title: lpd17fwd4_2: MinEig[$\mathcal{T}_{r^2} - \mathcal{H}^H \mathcal{H}$]).

Let $\lambda_{\inf}(\rho)$ denote the smallest real number in the spectrum of $\mathcal{T}_{R_\rho^2} - \mathcal{H}_{C_\rho}^* \mathcal{H}_{C_\rho}$.
Figure 24 plots an estimate of $\lambda_{\inf}(\rho)$. The optimal VSWR occurs near the
zero-crossing point.

Figure 25 uses these VSWR bounds to benchmark several classes of matching
circuits. Each circuit's VSWR is plotted as a function of the degree $d$ (the total
number of inductors and capacitors). The dashed lines are the VSWR from the
low- and high-pass ladders containing inductors and capacitors constrained to
practical design values. The solid line is the matching estimated from $U^+(2,d)$.
A transformer performs as well as any matching circuit of degree 0 and as well
as the low-pass ladders out to degree 6. The high-pass ladders get closer to
the VSWR bound at degree 4. A perfectly coupled transformer (coefficient of
coupling $k = 1$) offers only a slight improvement over the transformer. In terms
of making the tradeoff between VSWR and circuit complexity, Figure 25 directs
the circuit designer's attention to the $d = 2$ region. There exist matching circuits
of order 2 with performance comparable to high-pass ladders of order 4. Thus,
the circuit designer can graphically assess trade-offs between various circuits in
the context of knowing the best match possible for any lossless 2-port.
Figure 25. Comparing the matching performance of several classes of 2-ports
with the Nehari and $U^+(2,d)$ bounds (horizontal axis: degree $d$; legend: $U^+(2,d)$, low-pass ladder, high-pass ladder).



9. Research Topics 



This paper shows how to apply the Nehari bound to measured, real-world 
impedances. The price of admission is learning the scattering formalism and a 
few common electric circuits. The payoff is that many substantial research topics 
can be tastefully guided by this concrete problem. For immediate applications, 
several active and passive devices explicitly use wideband matching to improve 
performance: 

• antenna [49; 2; 8; 1]; 

• circulator [36]; 

• fiber-optic links [7; 26; 23]; 

• satellite links [40]; 

• amplifiers [11; 22; 37]. 

The $H^\infty$ applications to the transducers, antenna, and communication links
are immediate. The amplifier is an active 2-port that requires a more general
approach. The matching problem for the amplifier is to find input and output
matching 2-ports that simultaneously maximize transducer power gain, minimize
the noise figure, and maintain stability. Although a more general problem, this
amplifier-matching problem fits squarely in the $H^\infty$ framework [28; 29; 30] and
is a current topic in ONR's $H^\infty$ Research Initiative [41].

9.1. Darlington's Theorem and orbits. Parameterizing the orbits currently
limits the $H^\infty$ approach and leads to a series of generalizations of Darlington's
Theorem. An immediate application of Nehari's Theorem asks for a "unit-ball"
characterization of an orbit:

Question 9.1. For what $s_G \in BH^\infty(\mathbb{C}_+)$ is it true that $\mathcal{F}_2(U^+(2,\infty), s_G)$ is
dense in $\mathrm{Re}\,\overline{B}\mathcal{A}_1(\mathbb{C}_+)$?

This question of characterization is subsumed by the problem of computing orbits:

Question 9.2. What is the orbit of a general reflectance $\mathcal{F}_1(\mathcal{U}, s_L)$?

We can also generalize $U^+(2,\infty)$ and ask about the orbit of $s_L$ over all lumped
2-ports.

Question 9.3. Characterize all reflectances that belong to

\[ \bigcup_{d \ge 0} \mathcal{F}_1(U^+(2,d), s_L). \]

Closely related is the question of compatible impedances, or when a reflectance
$s_L$ belongs to the orbit of another reflectance $s_L'$.

Question 9.4. Let $s_L, s_L' \in BH^\infty(\mathbb{C}_+)$. Determine if there exists an $S \in
U^+(2)$ such that $s_L = \mathcal{F}_1(S, s_L')$.

The theory of compatible impedances is an active research topic in electrical
engineering [54] and has links to the Beurling-Lax Theorem [29].

9.2. $U^+(2)$ and circuits. The Circuit-Scattering Correspondence of Section 6
identified lumped, lossless N-ports and the scattering matrices of $U^+(N,d)$ [52].
By identifying an N-port as a subset of a Hilbert space, Section 1 claimed
that any linear, lossless, time-invariant, causal, maximal solvable N-port
corresponded to a scattering matrix in $U^+(N)$ [31]. The problem is to reconcile the
lumped approach, which has a concrete representation of a circuit, with the Hilbert
space claim, which gets a scattering matrix (not a circuit) by operator theory.

Question 9.5. Does every element in $U^+(2)$ correspond to a lossless 2-port?

In terms of Kirchhoff's current and voltage laws, if you were handed a collection
of integro-differential partial differential equations, is it obvious that the system
admits a scattering matrix?
9.3. Circuit synthesis and matrix dilations. If the matching problem with
$s_G = 0$,

\[ \inf\{\|\Delta P(s_2,s_L)\|_\infty^2 : s_2 \in \mathcal{F}_2(U^+(2); 0)\}, \]

admits a minimizer, then

\[ s_2 = \mathcal{F}_2(S, s_G = 0) = s_{22} + s_{21}\, s_G\, (1 - s_{11} s_G)^{-1} s_{12} \big|_{s_G = 0} = s_{22}. \]

How can we use $s_2$ to get a matching scattering matrix $S \in U^+(2)$? Thus, a
circuit synthesis problem is really a question in matrix dilations.

Question 9.6. Given $s_2 \in \overline{B}H^\infty(\mathbb{C}_+)$, find all $S \in U^+(2)$ such that

\[ S = \begin{bmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{bmatrix}
 = \begin{bmatrix} s_{11} & s_{12} \\ s_{21} & s_2 \end{bmatrix}. \]

Not all $s_2$'s can dilate to a lossless 2-port. Wohlers [52, pages 100-101] shows that
the 1-port with impedance $z(p) = \arctan(p)$ cannot dilate to an $S \in U^+(2)$. The
Douglas-Helton result characterizes those elements in the unit ball of $H^\infty$ that
came from a lossless N-port.



Theorem 9.1 ([14; 15]). Let $S(p) \in \overline{B}H^\infty(\mathbb{C}_+, \mathbb{C}^{N\times N})$ be a real matrix
function. The following are equivalent:

(a): $S(p)$ admits a real inner dilation

\[ \begin{bmatrix} S(p) & S_{12}(p) \\ S_{21}(p) & S_{22}(p) \end{bmatrix}. \]

(b): $S(p)$ has a meromorphic pseudocontinuation of bounded type to the open left
half-plane $\mathbb{C}_-$; that is, there exist $\phi \in H^\infty(\mathbb{C}_-)$ and $H \in H^\infty(\mathbb{C}_-, \mathbb{C}^{N\times N})$
such that

\[ \lim_{\sigma \to 0^+} S(\sigma + j\omega) = \lim_{\sigma \to 0^+} \frac{H}{\phi}(-\sigma + j\omega) \quad \text{a.e.} \]

(c): There is an inner function $\phi \in H^\infty(\mathbb{C}_+)$ such that $\phi S^H \in H^\infty(\mathbb{C}_+, \mathbb{C}^{N\times N})$.

Let $\mathcal{M}$ denote the subset of $\overline{B}H^\infty(\mathbb{C}_+)$ of functions that have meromorphic
pseudocontinuations of bounded type. A general hyperbolic Carleson-Jacobs
(Theorem 5.3) line of inquiry opens up to explore when the inequality

\[ \inf\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in \mathcal{M}\} \ge \min\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in \overline{B}H^\infty(\mathbb{C}_+)\} \]

holds with equality.

9.4. Structure of $U^+(2)$. Turning to the inclusion $U^+(2,\infty) \subset U^+(2)$, the
preceding sections have established that $U^+(2,\infty)$ is a closed subset of $U^+(2)$
that consists of all rational inner functions parameterized by Belevitch's Theorem.
Physically, $U^+(2,\infty)$ models all the lumped 2-ports, but does not model
the transmission line. It is natural to wonder what subclass of $U^+(2)$ contains
the lumped 2-ports and the transmission line. More precisely,

Question 9.7. What constitutes a lumped-distributed network? How do we
recognize its scattering matrix?

Wohlers [52] answers the first question by parameterizing the class of
lumped-distributed N-ports, consisting of $N_L$ inductors, $N_C$ capacitors, and $N_U$ uniform
transmission lines using the model in Figure 26. Wohlers [52, pages 168-172]
establishes that such scattering matrices exist and have the form

\[ S(p) = \mathcal{F}(S_a, S_L; p) = S_{a,11} + S_{a,12}\, S_L(p)\, (I_d - S_{a,22} S_L(p))^{-1} S_{a,21}, \]

where the augmented scattering matrix

\[ S_a = \begin{bmatrix} S_{a,11} & S_{a,12} \\ S_{a,21} & S_{a,22} \end{bmatrix} \]

models a network of wires, transformers, and gyrators. Consequently, $S_a$ is a
constant, real, orthogonal matrix of size $d = N_L + N_C + 2N_U$. $S_L(p)$ is called
the augmented load and models the reactive elements as

$S_L(p) = q I_{N_L} \oplus -q I_{N_C} \oplus {}$

This decomposition assumes: (1) the first $N_L + N_C$ ports are normalized to $Z_0 =
1$, and (2) the remaining $N_U$ pairs of ports are normalized to the characteristic
impedance of each transmission line. Although some work has been done
characterizing these scattering matrices, the reports in Wohlers [52, page 173] are
false, as determined by Choi [10].

Figure 26. State-space representation of a lumped-distributed lossless 2-port.
9.5. Error bounds. The problem is to determine if $\mathcal{T}_{r^2} \ge \mathcal{H}_c^* \mathcal{H}_c$, when all we
know are noisy samples of the center and radius functions measured at a finite
number of frequencies. Of the several approaches to this problem [29], we use
the simple Spline-FFT Method.

The Spline-FFT Nehari Algorithm. Given samples $\{(j\omega_k, C(j\omega_k))\}$ and
$\{(j\omega_k, R(j\omega_k))\}$, where $0 < \omega_1 < \omega_2 < \cdots < \omega_K < \infty$.

SF-1: Cayley transform the samples from $j\mathbb{R}$ to the unit circle $\mathbb{T}$:

\[ c(e^{j\theta_k}) := C \circ c^{-1}(e^{j\theta_k}), \qquad r(e^{j\theta_k}) := R \circ c^{-1}(e^{j\theta_k}). \]

SF-2: Use a spline to extend $\{(e^{j\theta_k}, c(e^{j\theta_k}))\}$ and $\{(e^{j\theta_k}, r(e^{j\theta_k}))\}$ to functions on
the unit circle $\mathbb{T}$.

SF-3: Approximate the Fourier coefficients using the FFT:

\[ \hat{c}(N; n) := \frac{1}{N} \sum_{n'=0}^{N-1} e^{-j2\pi n n'/N}\, c(e^{+j2\pi n'/N}), \qquad
 \widehat{r^2}(N; n) := \frac{1}{N} \sum_{n'=0}^{N-1} e^{-j2\pi n n'/N}\, r^2(e^{+j2\pi n'/N}). \]

SF-4: Make the truncated Toeplitz and Hankel matrices:

\[ \mathcal{T}_{r^2,M,N} := \left[ \widehat{r^2}(N; m_1 - m_2) \right]_{m_1,m_2=0}^{M-1}, \qquad
 \mathcal{H}_{c,M,N} := \left[ \hat{c}(N; -(m_1 + m_2)) \right]_{m_1,m_2=0}^{M-1}. \]

SF-5: Find the smallest eigenvalue of

\[ A_{M,N} := \mathcal{T}_{r^2,M,N} - \mathcal{H}_{c,M,N}^H \mathcal{H}_{c,M,N}. \]
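Steps SF-3 to SF-5 can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the samples are assumed to lie already on a uniform grid of the circle (so the spline step SF-2 is skipped), a direct DFT sum stands in for the FFT, the Hankel indexing is taken exactly as printed in SF-4, and a shifted power iteration stands in for a library eigenvalue solver. All function names are ours.

```python
import cmath
import math

def fourier_coeff(samples, n):
    """SF-3: (1/N) * sum_k exp(-2j*pi*n*k/N) * samples[k], samples taken at exp(2j*pi*k/N)."""
    N = len(samples)
    return sum(cmath.exp(-2j * math.pi * n * k / N) * x for k, x in enumerate(samples)) / N

def nehari_matrix(c_samples, r_samples, M):
    """SF-4 and SF-5: Toeplitz(r^2) - Hankel(c)^H Hankel(c), truncated to M x M."""
    r2 = [abs(r) ** 2 for r in r_samples]
    T = [[fourier_coeff(r2, a - b) for b in range(M)] for a in range(M)]
    H = [[fourier_coeff(c_samples, -(a + b)) for b in range(M)] for a in range(M)]
    return [[T[a][b] - sum(H[m][a].conjugate() * H[m][b] for m in range(M))
             for b in range(M)] for a in range(M)]

def smallest_eigenvalue(A, iterations=500):
    """Shifted power iteration for the smallest eigenvalue of a Hermitian matrix A."""
    M = len(A)
    shift = max(sum(abs(x) for x in row) for row in A)   # Gershgorin bound on the spectrum
    v = [complex(1.0 + 0.1 * i) for i in range(M)]        # arbitrary start vector
    for _ in range(iterations):
        w = [shift * v[a] - sum(A[a][b] * v[b] for b in range(M)) for a in range(M)]
        norm = math.sqrt(sum(abs(x) ** 2 for x in w))
        if norm == 0.0:
            return shift
        v = [x / norm for x in w]
    Av = [sum(A[a][b] * v[b] for b in range(M)) for a in range(M)]
    return sum(v[a].conjugate() * Av[a] for a in range(M)).real

N, M = 16, 4
thetas = [2 * math.pi * k / N for k in range(N)]
c_samples = [cmath.exp(-1j * t) for t in thetas]   # hypothetical center function c(z) = 1/z
r_samples = [math.sqrt(2.0)] * N                   # hypothetical constant radius
lam = smallest_eigenvalue(nehari_matrix(c_samples, r_samples, M))
print(lam)
```

The power iteration can converge slowly when eigenvalues cluster, so a production version would more likely call an established eigenvalue routine; the sign of the result is what the test in SF-5 is after.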



We are aware of the following sources of error: 

• The samples are corrupted by measurement errors. 

• The spline extensions from sampled data to functions defined on the unit 
circle T. 

• The Fourier coefficients are computed from an FFT of size N. 

• The operator A is computed from M x M truncations. 

Question 9.8. Are these all the sources of error (neglecting roundoff)? How 
can the Spline-FFT Nehari algorithm adapt to account for these errors? Can we 
put error bars on Figure 24? 
10. Epilogue

One of the great joys in applied mathematics is to link an abstract computation
to a physical system. Nehari's Theorem computes the norm of a Hankel
operator as the distance between its symbol $\phi \in L^\infty$ and the Hardy subspace
$H^\infty$:

\[ \|\mathcal{H}_\phi\| = \inf\{\|\phi - h\|_\infty : h \in H^\infty\}. \]

One of J. W. Helton's inspired observations linked this computation to a host of
problems in electrical engineering and control theory. These problems, in turn,
led Helton to deep and original extensions of operator theory, geometry, convex
analysis, and optimization theory.

By linking $H^\infty$ theory to the matching circuits, a physical meaning is attached
to the Nehari computation, which produces a plot that electrical engineers can
actually use. Along the way we encountered Darlington's Theorem, Belevitch's
Theorem, Weierstrass' Theorem, the Carleson-Jacobs theorems, Nehari's Theorem,
inner-function models, and hyperbolic geometry. Impedance matching
provides a case study of rather surprising mathematical richness in what may
appear at first to be a rather prosaic analog signal processing issue.

A measure of the vitality of a subject is the quality of the unexplored questions.
A small effort invested in circuit theory opens up a host of wonderful
research topics for mathematicians. The topics discussed in this paper indicate
only a few of the significant research opportunities that lie between mathematics
and electrical engineering. For the mathematician, there are few engineering
subjects where an advanced topic like $H^\infty$ has such an immediate connection
to actual physical devices. We hope our readers do realize a rich harvest from these
research opportunities.

Appendix A. Matrix-Valued Factorizations

This appendix proves Corollary 6.1 using Blaschke-Potapov factorizations.
We start with the scalar-valued case.

Lemma A.1. Let $h \in H^\infty(\mathbb{D})$ be an inner function. The following are equivalent:

(a): $h \in \mathcal{A}(\mathbb{D})$.

(b): $h$ is rational.

PROOF. (a ⟹ b) Factor $h$ as $h = cbs$, where $c \in \mathbb{T}$, $b$ is a Blaschke product, and $s$ is a
singular inner function. If $z_a \in \mathbb{T}$ is an accumulation point of the zeros $\{z_n\}$ of
$b$, that is, there is a subsequence $z_{n_k} \to z_a$, then continuity of $h$ on $\overline{\mathbb{D}}$ implies
that $0 = h(z_{n_k}) \to h(z_a)$. Continuity of $h$ on $\overline{\mathbb{D}}$ gives a neighborhood $U \subset \mathbb{T}$
of $z_a$ for which $|h(U)| < 1$. Thus, $h$ cannot be inner with $b$ an infinite Blaschke
product. Thus, $b$ can only be a finite product and has no accumulation points to
cancel the discontinuities of $s$. More formally, $b$ never vanishes on $\mathbb{T}$ and neither
$s$ nor $|s|$ is continuously extendable from the interior of the disk to any point
in the support of the singular measure that represents $s$ [35, pages 68-69]. Thus,
$h$ cannot have a singular part and we have $h = cb$.

(b ⟹ a) A rational $h$ also in $H^\infty(\mathbb{D})$ cannot have a pole in $\overline{\mathbb{D}}$. Then $h$ is
continuous on $\overline{\mathbb{D}}$ so belongs to the disk algebra. □



The result generalizes to matrix-valued inner functions. For $a \in \mathbf{D}$, define the
elementary Blaschke factor [38, Equation 4.2]:

\[
b_a(z) =
\begin{cases}
\dfrac{|a|}{a}\,\dfrac{a - z}{1 - \bar{a} z} & \text{if } a \neq 0, \\[1ex]
z & \text{if } a = 0.
\end{cases}
\]

To get a matrix-valued version, let $P \in \mathbf{C}^{N\times N}$ be an orthogonal projection:
$P^2 = P$ and $P^H = P$. The Blaschke-Potapov elementary factor associated with
$a$ and $P$ is [38, Equation 4.4]:

\[
B_{a,P}(z) := I_N + (b_a(z) - 1)P.
\]

There are a couple of ways to see that $B_{a,P}$ is inner. Let $U$ be a unitary matrix
that diagonalizes $P$:

\[
U^H P U = \begin{bmatrix} I_K & 0 \\ 0 & 0 \end{bmatrix}.
\]

Then,

\[
U^H B_{a,P}(z) U = \begin{bmatrix} b_a(z) I_K & 0 \\ 0 & I_{N-K} \end{bmatrix}.
\]

From this, we get [38, Equation 4.5]:

\[
\det[B_{a,P}(z)] = b_a(z)^{\operatorname{rank}[P]}.
\]
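The two properties just stated can be checked numerically. The following Python sketch is an illustration only, not part of the cited development: it assumes $N = 2$, a rank-one projection $P = vv^H$, and arbitrarily chosen values of $a$ and $z$. It confirms that $B_{a,P}(z)$ is unitary at a point of the unit circle and that its determinant equals $b_a(z)^{\operatorname{rank}[P]}$.

```python
import cmath

def blaschke(a, z):
    """Elementary Blaschke factor b_a(z) for a in the open unit disk."""
    if a == 0:
        return z
    return (abs(a) / a) * (a - z) / (1 - a.conjugate() * z)

def bp_factor(a, P, z):
    """Blaschke-Potapov factor B_{a,P}(z) = I + (b_a(z) - 1) P for a 2x2 P."""
    b = blaschke(a, z)
    return [[(1 if i == k else 0) + (b - 1) * P[i][k] for k in range(2)]
            for i in range(2)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def gram(M):
    """Return M^H M for a 2x2 complex matrix."""
    return [[sum(M[r][i].conjugate() * M[r][k] for r in range(2))
             for k in range(2)] for i in range(2)]

# Rank-one orthogonal projection onto the unit vector v (illustrative choice).
v = [0.6 + 0j, 0.8j]
P = [[v[i] * v[k].conjugate() for k in range(2)] for i in range(2)]

a = 0.3 + 0.4j          # zero of the factor, |a| < 1 (assumed value)
z = cmath.exp(0.7j)     # a point on the unit circle (assumed value)

B = bp_factor(a, P, z)
G = gram(B)             # equals the identity when B is unitary
unitary_error = max(abs(G[i][k] - (1 if i == k else 0))
                    for i in range(2) for k in range(2))
det_error = abs(det2(B) - blaschke(a, z) ** 1)   # rank[P] = 1 here
```

Both `unitary_error` and `det_error` should be at the level of floating-point roundoff.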



Definition A.1 ([38, pages 320-321]). The function $B : \mathbf{D} \to \mathbf{C}^{N\times N}$ is called
a left Blaschke-Potapov product if either $B$ is a constant unitary matrix or there
exists a unitary matrix $U$, a sequence of orthogonal projection matrices
$\{P_k : k \in \mathcal{K}\}$, and a sequence $\{z_k : k \in \mathcal{K}\} \subset \mathbf{D}$ such that

\[
\sum_{k \in \mathcal{K}} (1 - |z_k|)\,\operatorname{trace}[P_k] < \infty
\]

and the representation

\[
B(z) = \Bigl( \prod_{k \in \mathcal{K}} B_{z_k, P_k}(z) \Bigr) U
\]

holds.

Definition A.2 ([38, page 319]). Let $S \in H^\infty(\mathbf{D}, \mathbf{C}^{N\times N})$ be an inner function.
$S$ is called singular if and only if $\det[S(z)] \neq 0$ for all $z \in \mathbf{D}$.

Theorem A.1 ([38, Theorem 4.1]). Let $S \in H^\infty(\mathbf{D}, \mathbf{C}^{N\times N})$ be an inner
function. There exists a left Blaschke-Potapov product $B$ and a
$\mathbf{C}^{N\times N}$-valued singular inner function $\Xi$ such that

\[
S = B\,\Xi.
\]

Moreover, the representation is unique up to a unitary matrix $U$. If

\[
S = B_1 \Xi_1 = B_2 \Xi_2,
\]

then $B_2 = B_1 U$ and $\Xi_2 = U^H \Xi_1$.



Critical for our use is that the determinant maps these matrix-valued generaliza- 
tions of the Blaschke and singular functions to their scalar-valued counterparts. 

Theorem A.2 ([38, Theorem 4.2]). Let $S \in BH^\infty(\mathbf{D}, \mathbf{C}^{N\times N})$.

(a): $\det[S] \in BH^\infty(\mathbf{D})$.

(b): $S$ is inner if and only if $\det[S]$ is inner.

(c): $S$ is singular if and only if $\det[S]$ is singular.

With these results in place, Lemma A.l generalizes to the matrix-valued case. 



Proof of Corollary A.1. (a$\Rightarrow$b) Lemma 3.3 and Assumption (a) give
that $W = S \circ c^{-1}$ is a continuous inner function in $A(\mathbf{D}, \mathbf{C}^{2\times 2})$. Theorem A.1
gives that $W = B\,\Xi$ for a left Blaschke-Potapov product $B$ and singular $\Xi$.
Observe that $\det[W] = \det[B]\det[\Xi]$. If $W$ is inner, then $\det[W]$ is inner by
Theorem A.2(a). Because $W$ is continuous, $\det[W]$ is continuous and Lemma A.1
forces $\det[W]$ to be rational. Therefore, $\det[W]$ cannot admit the singular factor
$\det[\Xi]$. Consequently, $W$ cannot have a singular factor by Theorem A.2(c).
Because $\det[W]$ is rational and

\[
\det[W] = \det[B] = \prod_{k} b_{z_k}^{\operatorname{rank}[P_k]},
\]

we see that $B$ must be a finite left Blaschke-Potapov product. Thus, $W$ is
rational. Finally, this gives that $S = W \circ c$ is rational.

(b$\Rightarrow$a) Let

\[
S(p) = \frac{1}{g(p)}\,H(p),
\]

where $g(p)$ is a real polynomial

\[
g(p) = g_0 + g_1 p + \cdots + g_L p^L
\]

of degree $L$ that is strict Hurwitz (zeros only in $\mathbf{C}_-$) and $H(p)$ is a real $N \times N$
polynomial

\[
H(p) = H_0 + H_1 p + \cdots + H_M p^M
\]

of degree $M$. Boundedness forces $L \geq M$. Then,

\[
\frac{H(p)}{g(p)} = \frac{H_0 + \cdots + H_M p^M}{g_0 + \cdots + g_L p^L}
\;\xrightarrow{\;p \to \infty\;}\;
\begin{cases}
0 & \text{if } L > M, \\
H_M / g_L & \text{if } L = M.
\end{cases}
\]

Consequently, $H(p)/g(p)$ is continuous across $p = \pm j\infty$. Thus, $S(p)$ is continuous at
$\pm j\infty$. □



Appendix B. Proof of Lemma 4.4 

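The chain scattering representations are [25]:

\[
\mathcal{G}(\Theta_1; s) := \mathcal{F}_1(S, s), \qquad
\Theta_1 \sim \frac{1}{s_{21}}
\begin{bmatrix} -\det[S] & s_{11} \\ -s_{22} & 1 \end{bmatrix},
\]

\[
\mathcal{G}(\Theta_2; s) := \mathcal{F}_2(S, s), \qquad
\Theta_2 \sim \frac{1}{s_{12}}
\begin{bmatrix} -\det[S] & s_{22} \\ -s_{11} & 1 \end{bmatrix},
\]

where $\sim$ denotes equality in homogeneous coordinates: $\Theta \sim \Phi$ if and only if
$\mathcal{G}(\Theta) = \mathcal{G}(\Phi)$. Because $S(p)$ is unitary on $j\mathbf{R}$, $\Theta_1(p)$ and $\Theta_2(p)$ are $J$-unitary on
$j\mathbf{R}$ [29]:

\[
\Theta^H J \Theta = J = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}.
\]

Fix $\omega \in \mathbf{R}$. Define the maps $g_1$ and $g_2$ on the unit disk $\mathbf{D}$ as

\[
g_1(s) := \mathcal{G}(\Theta_1(j\omega), s), \qquad g_2(s) := \mathcal{G}(\Theta_2(j\omega), s).
\]

Because $\Theta_1(p)$ and $\Theta_2(p)$ are $J$-unitary on $j\mathbf{R}$, it follows that $g_1$ and $g_2$ are
invertible automorphisms of the unit disk onto itself with inverses:

\[
g_1^{-1}(s) = \mathcal{G}(\Theta_1(j\omega)^{-1}, s), \qquad
\Theta_1(j\omega)^{-1} \sim
\begin{bmatrix} -1 & s_{11}(j\omega) \\ -s_{22}(j\omega) & \det[S(j\omega)] \end{bmatrix},
\]

\[
g_2^{-1}(s) = \mathcal{G}(\Theta_2(j\omega)^{-1}, s), \qquad
\Theta_2(j\omega)^{-1} \sim
\begin{bmatrix} -1 & s_{22}(j\omega) \\ -s_{11}(j\omega) & \det[S(j\omega)] \end{bmatrix}.
\]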

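Because the $g_k$'s and their inverses are invertible automorphisms, Equation 4-9
gives that

\[
\left| \frac{g(s_1) - g(s_2)}{1 - \overline{g(s_1)}\,g(s_2)} \right|
= \left| \frac{s_1 - s_2}{1 - \overline{s_1}\,s_2} \right|
\]

for $s_1, s_2 \in \mathbf{D}$ and $g$ denoting either $g_1$, $g_2$, $g_1^{-1}$, or $g_2^{-1}$. For all $p \in j\mathbf{R}$, we
obtain

\[
\Delta P(s_2, s_L)
= \left| \frac{\overline{s_2} - s_L}{1 - s_2 s_L} \right|
= \left| \frac{\overline{g_2(s_G)} - s_L}{1 - g_2(s_G)\,s_L} \right|
= \left| \frac{\overline{s_G} - \bar{g}_2^{-1}(s_L)}{1 - s_G\,\bar{g}_2^{-1}(s_L)} \right|
= \Delta P(s_G, \bar{g}_2^{-1}(s_L)),
\]

where $\bar{g}_2(s) := \overline{g_2(\bar{s})} = \mathcal{G}(\overline{\Theta}_2; s)$.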

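Then $\Delta P(s_2, s_L) = \Delta P(s_G, s_1)$, provided we can show $s_1 = \bar{g}_2^{-1}(s_L)$. In terms
of the chain matrices, this requires us to show

\[
s_1 = \mathcal{G}(\Theta_1; s_L) = \mathcal{G}(\overline{\Theta}_2^{-1}; s_L) = \bar{g}_2^{-1}(s_L).
\]

This equality will follow if we can show $\Theta_1 \sim \overline{\Theta}_2^{-1}$ or that

\[
\begin{bmatrix} -1 & s_{11}/\det[S] \\ -s_{22}/\det[S] & 1/\det[S] \end{bmatrix}
=
\begin{bmatrix} -1 & \overline{s_{22}} \\ -\overline{s_{11}} & \overline{\det[S]} \end{bmatrix}.
\]

Because $S(p)$ is inner, $\det[S]$ is inner so that $\det[S] = 1/\overline{\det[S]}$ on $j\mathbf{R}$. Also, on
$j\mathbf{R}$, $S(p)$ is unitary so that

\[
S^{-1} = \frac{1}{\det[S]}
\begin{bmatrix} s_{22} & -s_{12} \\ -s_{21} & s_{11} \end{bmatrix}
=
\begin{bmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{bmatrix}^{H}.
\]

Then, $s_{22} = \overline{s_{11}}/\overline{\det[S]}$ and $s_{11} = \overline{s_{22}}/\overline{\det[S]}$. Thus, $\Theta_1 \sim \overline{\Theta}_2^{-1}$ so that
$s_1 = \bar{g}_2^{-1}(s_L)$ or that the LFT law holds. By Lemma 4.3, the LFT laws give the TGP
laws.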
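The conclusion of this appendix, that a lossless 2-port presents the same power mismatch at both ports, can be spot-checked at a single frequency. The Python sketch below is a hedged illustration: it assumes the conventions $\mathcal{F}_1(S, s_L) = s_{11} + s_{12}s_L(1 - s_{22}s_L)^{-1}s_{21}$, $\mathcal{F}_2(S, s_G) = s_{22} + s_{21}s_G(1 - s_{11}s_G)^{-1}s_{12}$ and $\Delta P(s_a, s_b) = |(\overline{s_a} - s_b)/(1 - s_a s_b)|$, and the unitary matrix and reflectances are arbitrary test values, not data from the text.

```python
import cmath
import math

def lossless_two_port(t, phi, a, b):
    """A 2x2 unitary matrix e^{j phi} [[alpha, beta], [-conj(beta), conj(alpha)]]."""
    alpha = math.cos(t) * cmath.exp(1j * a)
    beta = math.sin(t) * cmath.exp(1j * b)
    u = cmath.exp(1j * phi)
    return [[u * alpha, u * beta],
            [-u * beta.conjugate(), u * alpha.conjugate()]]

def f1(S, s_load):
    """Reflectance at port 1 when port 2 is terminated in s_load (assumed convention)."""
    return S[0][0] + S[0][1] * s_load * S[1][0] / (1 - S[1][1] * s_load)

def f2(S, s_gen):
    """Reflectance at port 2 when port 1 is terminated in s_gen (assumed convention)."""
    return S[1][1] + S[1][0] * s_gen * S[0][1] / (1 - S[0][0] * s_gen)

def power_mismatch(sa, sb):
    """Delta P(sa, sb) = |conj(sa) - sb| / |1 - sa sb|."""
    return abs((sa.conjugate() - sb) / (1 - sa * sb))

S = lossless_two_port(0.9, 0.3, 1.1, -0.4)   # arbitrary lossless 2-port
s_G = 0.2 - 0.5j                             # generator reflectance, |s_G| < 1
s_L = -0.6 + 0.3j                            # load reflectance, |s_L| < 1

s_1 = f1(S, s_L)
s_2 = f2(S, s_G)
mismatch_at_input = power_mismatch(s_G, s_1)
mismatch_at_output = power_mismatch(s_2, s_L)
```

The two mismatch values should agree to roundoff for any unitary `S` with a nonzero transmission term.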



Appendix C. Proof of Theorem 6.1 

Let $C(\mathbf{T}, \mathbf{C}^{N\times N})$ denote the continuous functions on the unit circle $\mathbf{T}$. Let $\mathcal{R}^L_M$
denote those rational functions $g^{-1}(q)H(q)$ in $C(\mathbf{T}, \mathbf{C}^{N\times N})$ where $g(q)$ and $H(q)$
are polynomials with degrees $\partial[g] \leq M$ and $\partial[H] \leq L$. The Existence Theorem
[9, page 154] shows that $\mathcal{R}^L_M$ is a boundedly compact subset of $C(\mathbf{T}, \mathbf{C}^{N\times N})$.
Lemma 3.3 shows the Cayley transform preserves compactness. Thus, $\mathcal{R}^L_M \circ c$ is
a boundedly compact subset of $1 + C_0(j\mathbf{R}, \mathbf{C}^{N\times N})$. By Lemma 3.1, $U^+(N)$ is a
closed subset of $L^\infty(j\mathbf{R}, \mathbf{C}^{N\times N})$. The intersection of a closed and bounded set
with a boundedly compact set is compact. Thus, $U^+(N) \cap \mathcal{R}^L_M \circ c$ is a compact
subset of $1 + C_0(j\mathbf{R}, \mathbf{C}^{N\times N})$. We claim that $U^+(N,d) = U^+(N) \cap \mathcal{R}^d_d \circ c$. Observe
$\mathcal{R}^d_d \circ c$ consists of all rational functions with the degree of the numerator and
denominator not exceeding $d$ and that are also continuous on $j\mathbf{R}$, including the
point at infinity. If $S \in U^+(N) \cap \mathcal{R}^d_d \circ c$, then $\deg_{SM}[S] \leq d$. This forces $S$
into $U^+(N,d)$. Consequently, $U^+(N,d) \supseteq U^+(N) \cap \mathcal{R}^d_d \circ c$. For the converse,
suppose $S \in U^+(N,d)$. By Corollary 6.1, $S \in \mathcal{A}_1(\mathbf{C}_+, \mathbf{C}^{N\times N})$, which forces $S$
into $\mathcal{R}^d_d \circ c$. Thus, $U^+(N,d) \subseteq U^+(N) \cap \mathcal{R}^d_d \circ c$ and equality must hold. Thus,
$U^+(N,d)$ is compact.

Appendix D. Proof of Theorem 5.5 

We start by remarking upon the disk with strict inequalities:

\[
D(c,r) := \{\phi \in L^\infty(j\mathbf{R}) : |\phi(j\omega) - c(j\omega)| < r(j\omega) \text{ a.e.}\}.
\]

First, $D(c,r)$ need not be open. For example, $D(0,1)$ contains the open unit
ball and is contained in its closure:

\[
BL^\infty(j\mathbf{R}) \subset D(0,1) \subset \overline{B}L^\infty(j\mathbf{R}).
\]

However,

\[
\phi(j\omega) = \frac{\omega}{1 + |\omega|}
\]

belongs to $D(0,1)$ but with $\|\phi\|_\infty = 1$, there is no neighborhood of $\phi$ that is
contained in the open unit ball.

Second, consider what the strict inequalities mean for those $\gamma : L^\infty(j\mathbf{R}) \to \mathbf{R}$
that are continuous with sublevel sets

\[
[\gamma \leq \alpha] = \overline{D}(c_\alpha, r_\alpha).
\]

We cannot claim that $[\gamma < \alpha]$ is $D(c_\alpha, r_\alpha)$. Instead, $[\gamma < \alpha]$ is an open set
contained in $D(c_\alpha, r_\alpha)$. In this regard, the following result gives us some control
of the strict inequality.

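Theorem D.1. Let $c, r \in L^\infty(j\mathbf{R})$. Assume $r^{-1} \in L^\infty(j\mathbf{R})$. Let $V$ be any
nonempty open subset of $L^\infty(j\mathbf{R})$ such that $V \subset D(c,r)$. For any $\phi \in V$,

\[
\|r^{-1}(\phi - c)\|_\infty < 1.
\]

PROOF. For any $\phi \in V$, the openness of $V$ implies there is an $\varepsilon > 0$ such that

\[
\phi + \varepsilon BL^\infty(j\mathbf{R}) \subset V.
\]

Consider the particular element of the open ball:

\[
\Delta\phi := \varepsilon'\,\operatorname{sgn}(\phi - c)\,\frac{r}{\|r\|_\infty},
\]

where $0 < \varepsilon' < \varepsilon$ and

\[
\operatorname{sgn}(z) :=
\begin{cases}
z/|z| & \text{if } z \neq 0, \\
0 & \text{if } z = 0.
\end{cases}
\]

Then $\phi + \Delta\phi \in D(c,r)$ so that

\[
r > |\phi + \Delta\phi - c| = |\phi - c| + \varepsilon'\,\frac{r}{\|r\|_\infty} \quad \text{a.e.}
\]

Divide by $r$ and take the norm to get

\[
1 \geq \|r^{-1}(\phi - c)\|_\infty + \varepsilon'\,\|r\|_\infty^{-1},
\]

or that $1 > \|r^{-1}(\phi - c)\|_\infty$. To complete the argument, we need to demonstrate
that the preceding argument is not vacuous. That is, $D(c,r)$ does indeed contain
an open set. Because $r$ does not "pinch off", $0 < \|r\|_{-\infty}$. Choose any $0 < \eta <
\|r\|_{-\infty}$. For any $\phi \in BL^\infty(j\mathbf{R})$,

\[
\|(\eta\phi + c) - c\|_\infty < \eta < r \quad \text{a.e.}
\]

Thus, the open set $c + \eta BL^\infty(j\mathbf{R})$ is contained in $D(c,r)$.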



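□

Proof of Theorem 5.5. There always holds

\[
\rho_{B\mathcal{A}_1} := \inf\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in B\mathcal{A}_1(\mathbf{C}_+)\}
\geq \min\{\|\Delta P(s_2, s_L)\|_\infty : s_2 \in BH^\infty(\mathbf{C}_+)\} =: \rho_{BH^\infty}.
\]

Suppose the inequality is strict. Then there is an $s_2 \in BH^\infty(\mathbf{C}_+)$ such that

\[
\rho_{B\mathcal{A}_1} > \|\Delta P(s_2, s_L)\|_\infty. \tag{D-1}
\]

By Lemma 4.6, the mapping $\Delta\rho(s_2) := \|\Delta P(s_2, s_L)\|_\infty$ is a continuous function
on $BL^\infty(j\mathbf{R})$. Consequently, $[\Delta\rho < \rho_{B\mathcal{A}_1}]$ is open with

\[
[\Delta\rho < \rho_{B\mathcal{A}_1}] \subset D(k_A, r_A),
\]

where the center function and radius functions are

\[
k_A := \overline{s_L}\,\frac{1 - \rho_{B\mathcal{A}_1}^2}{1 - \rho_{B\mathcal{A}_1}^2 |s_L|^2},
\qquad
r_A := \rho_{B\mathcal{A}_1}\,\frac{1 - |s_L|^2}{1 - \rho_{B\mathcal{A}_1}^2 |s_L|^2}.
\]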
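As a pointwise sanity check of these center and radius formulas, the Python sketch below fixes one frequency and treats $s_L$ and the mismatch level as constants (both values are arbitrary assumptions for illustration). It confirms that every point on the circle of center $k$ and radius $r$ has power mismatch equal to the chosen level, under the assumed definition $\Delta P(s_2, s_L) = |(\overline{s_2} - s_L)/(1 - s_2 s_L)|$.

```python
import cmath
import math

def power_mismatch(s2, sL):
    """Delta P(s2, sL) = |conj(s2) - sL| / |1 - s2 sL|."""
    return abs((s2.conjugate() - sL) / (1 - s2 * sL))

def mismatch_disk(sL, rho):
    """Center and radius of the disk {s2 : power_mismatch(s2, sL) <= rho}."""
    denom = 1 - rho ** 2 * abs(sL) ** 2
    center = sL.conjugate() * (1 - rho ** 2) / denom
    radius = rho * (1 - abs(sL) ** 2) / denom
    return center, radius

sL = 0.4 + 0.3j     # load reflectance at one frequency (assumed value)
rho = 0.6           # mismatch level (assumed value)
k, r = mismatch_disk(sL, rho)

# Sample the boundary circle and measure how far the mismatch deviates from rho.
boundary = [k + r * cmath.exp(2j * math.pi * i / 12) for i in range(12)]
worst_deviation = max(abs(power_mismatch(s, sL) - rho) for s in boundary)
center_mismatch = power_mismatch(k, sL)   # strictly inside the disk
```

The deviation on the boundary should be at roundoff level, and the center itself has mismatch below `rho`.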



Let $r_A$ have spectral factorization $r_A = |q_A|$. By Theorem D.1,

\[
\|q_A^{-1} k_A - q_A^{-1} s_2\|_\infty < 1.
\]

If we assume that $q_A^{-1} k_A \in 1 + C_0(j\mathbf{R})$, Theorem 5.2 forces equality:

\[
1 > \|q_A^{-1} k_A - H^\infty(\mathbf{C}_+)\|_\infty = \|q_A^{-1} k_A - \mathcal{A}_1(\mathbf{C}_+)\|_\infty.
\]

The equality lets us select $s_A \in \mathcal{A}_1(\mathbf{C}_+)$ that satisfies

\[
1 - \varepsilon_0 > \|q_A^{-1}(k_A - s_A)\|_\infty,
\]

for some $1 > \varepsilon_0 > 0$. This forces the pointwise result:

\[
(1 - \varepsilon_0)\,r_A > |k_A - s_A| \quad \text{a.e.}
\]

With some effort, we will show that this pointwise inequality implies

\[
\Delta\rho(s_A) < \rho_{B\mathcal{A}_1}.
\]

This contradiction implies that Equation D-1 cannot be true or that the
inequality $\rho_{B\mathcal{A}_1} \geq \rho_{BH^\infty}$ cannot be strict.

To start this demonstration, we first prove $q_A^{-1} k_A$ is continuous. Because $s_L$
belongs to the open unit ball of the disk algebra, both $k_A$ and $r_A$ belong to
$1 + C_0(j\mathbf{R})$. Thus, it remains to prove that $q_A^{-1}$ is continuous. Lemma 3.3 gives
that $R_A = r_A \circ c^{-1}$ belongs to $C(\mathbf{T})$. Ignore the trivial case when $\rho_{B\mathcal{A}_1} = 0$.
Because

\[
R_A \geq \rho_{B\mathcal{A}_1}\,(1 - \|s_L\|_\infty^2) > 0
\]

it follows that $\log(R_A) \in C(\mathbf{T})$ and defines the outer function [18, page 24]:

\[
Q_A(z) := \exp\left( \frac{1}{2\pi} \int_0^{2\pi} \frac{e^{j\theta} + z}{e^{j\theta} - z}\,\log R_A(e^{j\theta})\,d\theta \right).
\]

Lemma 3.3 gives that $q_A = Q_A \circ c \in \mathcal{A}_1(\mathbf{C}_+)$ and is also an outer function.
Because $q_A$ is an outer function, $q_A^{-1} \in \mathcal{A}_1(\mathbf{C}_+)$. Thus, a spectral factorization
exists in the disk algebra.

To continue, define for $\varepsilon \in [0, \varepsilon_0]$,

\[
\rho(\varepsilon) := (1 - \varepsilon)\,\rho_{B\mathcal{A}_1}.
\]

Define

\[
k_\varepsilon := \overline{s_L}\,\frac{1 - \rho(\varepsilon)^2}{1 - \rho(\varepsilon)^2 |s_L|^2},
\qquad
r_\varepsilon := \rho(\varepsilon)\,\frac{1 - |s_L|^2}{1 - \rho(\varepsilon)^2 |s_L|^2}.
\]

In $L^\infty(j\mathbf{R})$, $k_\varepsilon \to k_A$ and $r_\varepsilon \to r_A$ as $\varepsilon \to 0$. Then

\[
|s_A - k_\varepsilon| \leq |s_A - k_A| + |k_A - k_\varepsilon|
\leq (1 - \varepsilon_0)\,r_A + |k_A - k_\varepsilon|
\leq (1 - \varepsilon_0)\,r_\varepsilon + |r_A - r_\varepsilon| + |k_A - k_\varepsilon|.
\]

Because the last two terms are bounded as $O[\varepsilon]$,

\[
|s_A - k_\varepsilon| \leq r_\varepsilon - \varepsilon_0 r_\varepsilon + O[\varepsilon].
\]

Because $r_A$ is uniformly positive, and $r_\varepsilon$ converges to $r_A$, the last two terms are
uniformly negative for all $\varepsilon > 0$ sufficiently small. This puts

\[
s_A \in D(k_\varepsilon, r_\varepsilon) \iff \Delta\rho(s_A) < (1 - \varepsilon)\,\rho_{B\mathcal{A}_1}. \qquad \Box
\]



Engineering Applications of the Motion-Group
Fourier Transform 

GREGORY S. CHIRIKJIAN AND YUNFENG WANG 



Abstract. We review a number of engineering problems that can be posed 
or solved using Fourier transforms for the groups of rigid-body motions of 
the plane or three-dimensional space. Mathematically and computationally 
these problems can be divided into two classes: (1) physical problems that 
are described as degenerate diffusions on motion groups; (2) enumeration 
problems in which fast Fourier transforms are used to efficiently compute 
motion-group convolutions. We examine engineering problems including 
the analysis of noise in optical communication systems, the allowable po- 
sitions and orientations reachable with a robot arm, and the statistical 
mechanics of polymer chains. In all of these cases, concepts from non- 
commutative harmonic analysis are put to use in addressing real-world 
problems, thus rendering them tractable. 



1. Introduction 

Noncommutative harmonic analysis is a beautiful and powerful area of pure 
mathematics that has connections to analysis, algebra, geometry, and the the- 
ory of algorithms. Unfortunately, it is also an area that is almost unknown to 
engineers. In our research group, we have addressed a number of seemingly 
intractable “real-world” engineering problems that are easily modeled and/or 
solved using techniques of noncommutative harmonic analysis. In particular, we 
have addressed physical/mechanical problems that are described well as
functions or processes on the rotation and rigid-body-motion groups. The
interactions and evolution of these functions are described using group-theoretic
convolutions and diffusion equations, respectively. In this paper we provide a survey
of some of these applications and show how computational harmonic analysis on 
motion groups is used. 

The group of rigid-body motions, denoted as $SE(N)$ (shorthand for "special
Euclidean" group in $N$-dimensional space), is a unimodular semidirect product
group, and general methods for constructing unitary representations of such Lie
groups have been known for some time (see [1; 25; 35], for example). In the
past 40 years, the representation theory and harmonic analysis for the Euclidean
groups have been developed in the pure mathematics and mathematical physics
literature. The study of matrix elements of irreducible unitary representations
of $SE(3)$ was initiated by N. Vilenkin [39; 40] in 1957 (some particular matrix
elements are also given in [41]). The most complete study of $\widetilde{SE}(3)$ (the universal
covering group of $SE(3)$) with application to harmonic analysis was given by
W. Miller in [28]. The representations of $SE(3)$ were also studied in [16; 36; 37].
In recent works, fast Fourier transforms for $SE(2)$ and $SE(3)$ have been proposed
[24], and an operational calculus has been constructed [5].

However, despite the considerable progress in mathematical developments of 
the representation theory of SE(3), these achievements have not yet been widely 
incorporated in engineering and applied fields. In work summarized here we try 
to fill this gap. A more detailed treatment of numerous applications can be found 
in [6]. 

In Section 2 we review the representation theory of SE(2), give the matrix 
elements of the irreducible unitary representations and review the definition of 
the Fourier transform for SE(2). We also review operational properties of the 
Fourier transform. We do not go into the intricate details of the Fourier transform 
for SE(3), as those are provided in the references described above and they add 
little to the understanding of how to apply noncommutative harmonic analysis 
to real-world problems. Sections 3, 4 and 5 are devoted to application areas: 
coherent optical communications, robotics, and polymer statistical mechanics, 
respectively. 



2. Fourier Analysis of Motion 

In this section we review the basic definitions and properties of the Euclidean 
motion groups. Our emphasis is on the motion group of the plane, but most of 
the concepts extend in a natural way to three-dimensional space. See [6] for a 
complete treatment. 

2.1. Euclidean motion group. The Euclidean motion group, $SE(N)$, is the
semidirect product of $\mathbf{R}^N$ with the special orthogonal group, $SO(N)$. We denote
elements of $SE(N)$ as $g = (a, A) \in SE(N)$ where $A \in SO(N)$ and $a \in \mathbf{R}^N$. The
identity element is $e = (0, I)$ where $I$ is the $N \times N$ identity matrix. For any
$g = (a, A)$ and $h = (r, R) \in SE(N)$, the group law is written as $g \circ h = (a + Ar, AR)$,
and $g^{-1} = (-A^T a, A^T)$. Any $g = (a, A) \in SE(N)$ acts transitively on a position
$x \in \mathbf{R}^N$ as

\[
g \cdot x = Ax + a.
\]

That is, position vector $x$ is rigidly moved by a rotation followed by a translation.
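The group law, inverse, and action translate directly into code. The following Python sketch, for the planar case $N = 2$, is only an illustration (the helper names `compose`, `inverse`, and `act` and the sample motions are our own choices). It checks that $(g \circ h) \cdot x = g \cdot (h \cdot x)$ and that $g \circ g^{-1}$ is the identity.

```python
import math

def rot(phi):
    """2x2 rotation matrix in SO(2)."""
    return [[math.cos(phi), -math.sin(phi)], [math.sin(phi), math.cos(phi)]]

def matvec(A, x):
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def compose(g, h):
    """Group law (a, A) o (r, R) = (a + A r, A R)."""
    (a, A), (r, R) = g, h
    Ar = matvec(A, r)
    return ([a[i] + Ar[i] for i in range(len(a))], matmul(A, R))

def inverse(g):
    """Inverse (a, A)^{-1} = (-A^T a, A^T)."""
    a, A = g
    At = transpose(A)
    return ([-c for c in matvec(At, a)], At)

def act(g, x):
    """Action g . x = A x + a."""
    a, A = g
    Ax = matvec(A, x)
    return [Ax[i] + a[i] for i in range(len(a))]

g = ([1.0, 2.0], rot(0.5))       # sample motions (arbitrary values)
h = ([-0.3, 0.7], rot(-1.2))
x = [0.4, -0.9]

lhs = act(compose(g, h), x)      # (g o h) . x
rhs = act(g, act(h, x))          # g . (h . x)
e_a, e_A = compose(g, inverse(g))
```

Here `lhs` and `rhs` agree to roundoff, and `(e_a, e_A)` is the identity element $(0, I)$.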

Often in the engineering literature, no distinction is made between a motion,
$g$, and the result of that motion acting on the identity element (called a pose
or reference frame). Hence, we interchangeably use the words "motion" and
"frame" when referring to elements of $SE(N)$.

It is convenient to think of an element of $SE(N)$ as an $(N + 1) \times (N + 1)$
matrix of the form:

\[
g = \begin{pmatrix} A & a \\ 0^T & 1 \end{pmatrix}.
\]

In the engineering literature, matrices with this kind of structure are called
homogeneous transforms.

For example, each element of $SE(2)$ can be parameterized using polar
coordinates as:

\[
g(r, \theta, \phi) =
\begin{pmatrix}
\cos\phi & -\sin\phi & r\cos\theta \\
\sin\phi & \cos\phi & r\sin\theta \\
0 & 0 & 1
\end{pmatrix},
\]

where $r \geq 0$ is the magnitude of translation. $SE(2)$ is a 3-dimensional
manifold much like $\mathbf{R}^3$. We can integrate over $SE(2)$ using the volume element
$d(g(r,\theta,\phi)) = (4\pi^2)^{-1}\,r\,dr\,d\theta\,d\phi$. This volume element is bi-invariant in the
sense that it does not change under left and right shifts by any fixed element
$h \in SE(2)$:

\[
d(h \circ g) = d(g \circ h) = d(g).
\]

Bi-invariant volume elements exist for $SE(N)$ for $N = 2, 3, 4, \ldots$. A group with
bi-invariant volume element is called a unimodular group.

The Lie group SE(2) has an associated Lie algebra se(2). Physically, elements 
of SE(2) describe finite motions in the plane, whereas elements of se(2) represent 
infinitesimal motions. Since SE(2) is a three-dimensional Lie group, there are 
three independent directions along which any infinitesimal motion can be de- 
composed. The vector space of all such motions relative to the identity element 
e G SE(2) together with the matrix commutator operation defines se(2). As 
with any vector space, we can choose an appropriate basis. One such basis for 
the Lie algebra se(2) consists of the following three matrices: 

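\[
X_1 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}; \qquad
X_2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}; \qquad
X_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]

The following one-parameter motions are obtained by exponentiating the above
basis elements of $se(2)$:

\[
g_1(t) = \exp(tX_1) = \begin{pmatrix} 1 & 0 & t \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}; \qquad
g_2(t) = \exp(tX_2) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & t \\ 0 & 0 & 1 \end{pmatrix};
\]

\[
g_3(t) = \exp(tX_3) = \begin{pmatrix} \cos t & -\sin t & 0 \\ \sin t & \cos t & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]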

For the purposes of the current discussion, we can take as a definition of $se(2)$
the vector space of all linear combinations of $X_1$, $X_2$, and $X_3$. The
exponential mapping

\[
\exp : se(2) \to SE(2)
\]

is well-defined for every element of $se(2)$ and is invertible except at a set of
measure zero in $SE(2)$.

Any rigid-body motion in the plane can be expressed as an appropriate
combination of these three basic motions. For example, $g = g_1(x)\,g_2(y)\,g_3(\phi)$.
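The decomposition $g = g_1(x)\,g_2(y)\,g_3(\phi)$ can be verified numerically. The Python sketch below is illustrative only: it evaluates each exponential with a truncated power series (a simple choice that is adequate for these $3 \times 3$ matrices, not a general-purpose matrix exponential) and compares the product with the homogeneous transform built directly from $(x, y, \phi)$. The numerical values are arbitrary.

```python
import math

X1 = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
X2 = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
X3 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def exp_series(X, t, terms=40):
    """exp(tX) from a truncated power series sum_k (tX)^k / k!."""
    result = [[float(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes (tX)^k / k!
        term = matmul(term, [[t * X[i][j] / k for j in range(3)] for i in range(3)])
        result = [[result[i][j] + term[i][j] for j in range(3)] for i in range(3)]
    return result

def homogeneous(x, y, phi):
    """Homogeneous transform with rotation angle phi and translation (x, y)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]]

x, y, phi = 0.8, -0.5, 1.3
g = matmul(matmul(exp_series(X1, x), exp_series(X2, y)), exp_series(X3, phi))
H = homogeneous(x, y, phi)
error = max(abs(g[i][j] - H[i][j]) for i in range(3) for j in range(3))
```

The resulting `error` should be at roundoff level, confirming that translation along $x$, translation along $y$, and rotation by $\phi$ compose to the expected frame.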

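2.2. Differential operators on SE(2). The way to take partial derivatives of
a function of motion is to evaluate

\[
X_i^R f = \left.\frac{d}{dt} f(g \circ \exp(tX_i))\right|_{t=0}, \qquad
X_i^L f = \left.\frac{d}{dt} f(\exp(tX_i) \circ g)\right|_{t=0}.
\]

(In our notation, $R$ means that the exponential appears on the right, and $L$
means that it appears on the left. This means that $X_i^R$ is invariant under left
shifts, while $X_i^L$ is invariant under right shifts. Our notation is different from
others in the mathematics literature where the superscript denotes the invariance
of the vector field formed by the concatenation of these derivatives.) Explicitly,
we find the differential operators $X_i^R$ in polar coordinates to be [6]

\[
X_1^R = \cos(\phi - \theta)\frac{\partial}{\partial r} + \frac{\sin(\phi - \theta)}{r}\frac{\partial}{\partial\theta}, \qquad
X_2^R = -\sin(\phi - \theta)\frac{\partial}{\partial r} + \frac{\cos(\phi - \theta)}{r}\frac{\partial}{\partial\theta}, \qquad
X_3^R = \frac{\partial}{\partial\phi},
\]

and in Cartesian coordinates to be

\[
X_1^R = \cos\phi\frac{\partial}{\partial x} + \sin\phi\frac{\partial}{\partial y}, \qquad
X_2^R = -\sin\phi\frac{\partial}{\partial x} + \cos\phi\frac{\partial}{\partial y}, \qquad
X_3^R = \frac{\partial}{\partial\phi}.
\]

The differential operators $X_i^L$ in polar coordinates are

\[
X_1^L = \cos\theta\frac{\partial}{\partial r} - \frac{\sin\theta}{r}\frac{\partial}{\partial\theta}, \qquad
X_2^L = \sin\theta\frac{\partial}{\partial r} + \frac{\cos\theta}{r}\frac{\partial}{\partial\theta}, \qquad
X_3^L = \frac{\partial}{\partial\phi} + \frac{\partial}{\partial\theta}.
\]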
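These operators can be checked against their defining limits. The Python sketch below is a hedged illustration: the test function `f` and the evaluation point are arbitrary choices. It compares a central finite difference of $f(g \circ \exp(tX_1))$ with the Cartesian expression $\cos\phi\,\partial f/\partial x + \sin\phi\,\partial f/\partial y$, and a finite difference of $f(\exp(tX_3) \circ g)$ with $\partial f/\partial\phi + \partial f/\partial\theta$, where $\partial/\partial\theta = x\,\partial/\partial y - y\,\partial/\partial x$.

```python
import math

def f(x, y, phi):
    """An arbitrary smooth test function on SE(2) in Cartesian coordinates."""
    return math.sin(x) * math.cos(2 * y) * math.cos(phi) + x * y

def grad_f(x, y, phi):
    """Analytic partial derivatives of f with respect to x, y, phi."""
    fx = math.cos(x) * math.cos(2 * y) * math.cos(phi) + y
    fy = -2 * math.sin(x) * math.sin(2 * y) * math.cos(phi) + x
    fphi = -math.sin(x) * math.cos(2 * y) * math.sin(phi)
    return fx, fy, fphi

def right_shift_X1(x, y, phi, t):
    """Coordinates of g o exp(t X_1): translate by t along the body-fixed x-axis."""
    return x + t * math.cos(phi), y + t * math.sin(phi), phi

def left_shift_X3(x, y, phi, t):
    """Coordinates of exp(t X_3) o g: rotate position and orientation by t."""
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t),
            phi + t)

def central_difference(shift, x, y, phi, step=1e-5):
    return (f(*shift(x, y, phi, step)) - f(*shift(x, y, phi, -step))) / (2 * step)

x, y, phi = 0.7, -0.4, 1.1
fx, fy, fphi = grad_f(x, y, phi)

X1R_exact = math.cos(phi) * fx + math.sin(phi) * fy
X3L_exact = fphi + (x * fy - y * fx)
X1R_numeric = central_difference(right_shift_X1, x, y, phi)
X3L_numeric = central_difference(left_shift_X3, x, y, phi)
```

Agreement to roughly the square of the step size is expected for both operators.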



2.3. Fourier analysis on SE(2). The Fourier transform, $\mathcal{F}$, of a function of
motion, $f(g)$ where $g \in SE(N)$, is an infinite-dimensional matrix defined as [6]:

\[
\mathcal{F}(f) = \hat{f}(p) = \int_G f(g)\,U(g^{-1}, p)\,d(g)
\]

where $U(g,p)$ is an infinite-dimensional matrix function with the property that
$U(g_1 \circ g_2, p) = U(g_1, p)\,U(g_2, p)$. This kind of matrix is called a matrix
representation of $SE(N)$. It has the property that it converts convolutions on $SE(N)$ into
matrix products:

\[
\mathcal{F}(f_1 * f_2) = \mathcal{F}(f_2)\,\mathcal{F}(f_1).
\]

In the case when $N = 2$, the original function is reconstructed as

\[
\mathcal{F}^{-1}(\hat{f}) = f(g) = \int_0^\infty \operatorname{trace}\bigl(\hat{f}(p)\,U(g,p)\bigr)\,p\,dp,
\]

and the matrix elements of $U(g,p)$ are expressed explicitly as [6]:

\[
u_{mn}(g(r,\theta,\phi), p) = j^{\,n-m}\,e^{-j[n\phi + (m-n)\theta]}\,J_{n-m}(pr)
\]

where $J_\nu(x)$ is the $\nu$th order Bessel function and $j = \sqrt{-1}$. This inverse transform
can be written in terms of elements as

\[
f(g) = \sum_{m,n \in \mathbf{Z}} \int_0^\infty \hat{f}_{mn}\,u_{nm}(g,p)\,p\,dp. \tag{2-1}
\]

In analogy with the classical Fourier transform, which converts derivatives 
of functions of position into algebraic operations in Fourier space, there are 
operational properties for the motion-group Fourier transform. 

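By the definition of the $SE(2)$-Fourier transform $\mathcal{F}$ and operators $X_i^R$ and $X_i^L$,
we can write the Fourier transform of the derivatives of a function of motion as

\[
\mathcal{F}[X_i^R f] = u(X_i, p)\,\hat{f}(p), \qquad
\mathcal{F}[X_i^L f] = -\hat{f}(p)\,u(X_i, p),
\]

where

\[
u(X_i, p) = \left.\frac{d}{dt} U(\exp(tX_i), p)\right|_{t=0}.
\]

Explicitly,

\[
u_{mn}(\exp(tX_1), p) = j^{\,n-m} J_{n-m}(pt).
\]

We know that

\[
\frac{d}{dx} J_m(x) = \tfrac{1}{2}\,[J_{m-1}(x) - J_{m+1}(x)]
\]

and

\[
J_{m-n}(0) =
\begin{cases}
1 & \text{for } m - n = 0, \\
0 & \text{for } m - n \neq 0.
\end{cases}
\]

Hence,

\[
u_{mn}(X_1, p) = \left.\frac{d}{dt} u_{mn}(\exp(tX_1), p)\right|_{t=0}
= \frac{jp}{2}\,(\delta_{m,n+1} + \delta_{m,n-1}).
\]

Likewise,

\[
u_{mn}(\exp(tX_2), p) = j^{\,n-m} e^{-j(n-m)\pi/2} J_{m-n}(pt) = J_{m-n}(pt),
\]

and so

\[
u_{mn}(X_2, p) = \left.\frac{d}{dt} u_{mn}(\exp(tX_2), p)\right|_{t=0}
= \frac{p}{2}\,\bigl(J_{m-n-1}(0) - J_{m-n+1}(0)\bigr)
= \frac{p}{2}\,(\delta_{m,n+1} - \delta_{m,n-1}).
\]

Similarly, we find

\[
u_{mn}(\exp(tX_3), p) = e^{-jmt}\,\delta_{m,n}
\]

and

\[
u_{mn}(X_3, p) = \left.\frac{d}{dt} u_{mn}(\exp(tX_3), p)\right|_{t=0} = -jm\,\delta_{m,n}.
\]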
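Because $U$ is a representation, the matrices $u(X_i, p)$ must reproduce the commutation relations of $se(2)$, namely $[X_3, X_1] = X_2$ and $[X_3, X_2] = -X_1$. The Python sketch below checks this on a finite truncation. It assumes the matrix elements $u_{mn}(X_1,p) = \tfrac{jp}{2}(\delta_{m,n+1} + \delta_{m,n-1})$, $u_{mn}(X_2,p) = \tfrac{p}{2}(\delta_{m,n+1} - \delta_{m,n-1})$ and $u_{mn}(X_3,p) = -jm\,\delta_{m,n}$; the truncation size `M` and the value of `p` are arbitrary. The relation $[X_1, X_2] = 0$ is not tested because truncation spoils it at the corner entries.

```python
def u_matrices(p, M):
    """Truncate u(X_i, p) to indices m, n in {-M, ..., M}."""
    idx = range(-M, M + 1)
    d = lambda m, n: 1 if m == n else 0      # Kronecker delta
    u1 = [[0.5j * p * (d(m, n + 1) + d(m, n - 1)) for n in idx] for m in idx]
    u2 = [[0.5 * p * (d(m, n + 1) - d(m, n - 1)) for n in idx] for m in idx]
    u3 = [[-1j * m * d(m, n) for n in idx] for m in idx]
    return u1, u2, u3

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def max_diff(A, B):
    n = len(A)
    return max(abs(A[i][j] - B[i][j]) for i in range(n) for j in range(n))

u1, u2, u3 = u_matrices(p=2.0, M=4)
minus_u1 = [[-v for v in row] for row in u1]

err_31 = max_diff(commutator(u3, u1), u2)         # [u(X3), u(X1)] = u(X2)
err_32 = max_diff(commutator(u3, u2), minus_u1)   # [u(X3), u(X2)] = -u(X1)
```

Since $u(X_3, p)$ is diagonal, both relations hold exactly even after truncation, so the two errors should vanish up to roundoff.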



Fast Fourier transforms for SE(2) and SE(3) have been outlined in [6; 24]. 
Operational properties for SE(3) which are analogous to those presented here for 
SE(2) can be found in [5; 6]. Subsequent sections in this paper describe various 
applications of motion-group Fourier analysis to problems in engineering. 



3. Phase Noise in Coherent Optical Communications 

In optical communications, laser light is used to transmit information along 
fiber optic cables. There are several methods that are used to transmit and 
detect information within the light. Coherent detection (in contrast to direct 
detection) is a method that has the ability to detect the phase, frequency,
amplitude and polarization of the incident light signal. Therefore, information can be
transmitted via phase, frequency, amplitude, or polarization modulation. How- 
ever, the phase of the light emitted from a semiconductor laser exhibits random 
fluctuations due to spontaneous emissions in the laser cavity [19]. This phenom- 
enon is commonly referred to as phase noise. Phase noise puts strong limitations 
on the performance of coherent communication systems. Evaluating the influ- 
ence of phase noise is essential in system design and optimization and has been 
studied extensively in the literature [10; 12]. Analytical models that describe the 
relationship between phase noise and the filtered signal are found in [2; 11]. In 
particular, the Fokker-Planck approach represents the most rigorous description 
of phase noise effects [13; 14]. To better apply this approach to system design 
and optimization, an efficient and powerful computational tool is necessary. In 
this section, we describe one such tool that is based on the motion-group Fourier 
transform. Readers unfamiliar with the technical terms used below are referred 
to [21]. The discussion in the following paragraph provides a context for this 
particular engineering application, but the value of noncommutative harmonic 
analysis in this context is solely due to its ability to solve equation (3-1). 

Let $s(t)$ be the input signal to a bandpass filter which is corrupted by phase
noise. Using the equivalent baseband representation and normalizing it to unit
amplitude, this signal can be written as [14]

\[
s(t) = e^{j\phi(t)}
\]

where $\phi(t)$ is the phase noise, usually modeled as a Brownian motion process.
The function $h(t)$ is the impulse response of the bandpass filter. The output of
the bandpass filter is denoted $z(t)$. Let us represent $z(t)$ through its real and
imaginary parts:

\[
z(t) = x(t) + jy(t) = r(t)\,e^{j\theta(t)}.
\]

The 3-D Fokker-Planck equation defining the probability density function (pdf)
of $z(t)$ is derived as [2; 45]:

\[
\frac{\partial f}{\partial t}
= -h(t)\cos\phi\,\frac{\partial f}{\partial x}
- h(t)\sin\phi\,\frac{\partial f}{\partial y}
+ \frac{D}{2}\,\frac{\partial^2 f}{\partial\phi^2} \tag{3-1}
\]

with initial condition $f(x, y, \phi; 0) = \delta(x)\delta(y)\delta(\phi)$, where $\delta$ is the Dirac delta
function. The parameter $D$ is related to the laser line width $\Delta\nu$ by $D = 2\pi\Delta\nu$.
Having an efficient method for solving equation (3-1) is of great importance in
the design of filters.

A number of papers have attempted to solve the above equation using a
variety of techniques including series expansions, numerical methods based on
discretizing the domain, and analytical methods [42; 45]. However, all of them
are based on classical partial differential equation solution techniques.

In our work, we present a new method for solving these equations using
harmonic analysis on groups. These techniques reduce the above Fokker-Planck
equations to systems of linear ordinary differential equations with constant or
time-varying coefficients in a generalized Fourier space. The solution to this
system of equations in generalized Fourier space is simply a matrix exponential
for the case of constant coefficients. A usable solution is then generated via the
generalized Fourier inversion formula.

Using the differential operators defined on the motion group, the 3-D
Fokker-Planck equation in (3-1) can be rewritten as

\[
\frac{\partial f}{\partial t} = \left( -h(t)\,X_1^R + \frac{D}{2}\,(X_3^R)^2 \right) f. \tag{3-2}
\]

This equation describes a kind of process that evolves on the group of
rigid-body motions $SE(2)$. Applying the motion-group Fourier transform to (3-2), we
can convert it to an infinite system of linear ordinary differential equations:

\[
\frac{d\hat{f}}{dt} = A(t)\,\hat{f}. \tag{3-3}
\]

For equation (3-2), the matrix is

\[
A(t) = -h(t)\,u(X_1, p) + \frac{D}{2}\,\bigl(u(X_3, p)\bigr)^2
\]

and its elements are

\[
A(t)_{mn} = -\frac{jp\,h(t)}{2}\,(\delta_{m,n+1} + \delta_{m,n-1}) - \frac{D}{2}\,m^2\,\delta_{m,n}.
\]

Numerical methods such as Runge-Kutta integration can be applied to
solve the truncated version of this system. In the case when $h(t)$ is a constant,
$A$ is a constant matrix and the solution to the resulting linear time-invariant
system can be written in closed form as

\[
\hat{f}(p; t) = \exp(At)
\]

with the initial condition that $\hat{f}(p; 0)$ is the infinite-dimensional identity matrix.
In practice we truncate $A$ at finite dimension, then exponentiate.
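The truncate-then-exponentiate step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the results cited in this section: it assumes constant $h$, matrix elements $A_{mn} = -\tfrac{jph}{2}(\delta_{m,n+1} + \delta_{m,n-1}) - \tfrac{D}{2}m^2\delta_{m,n}$, a truncation to $|m|, |n| \leq M$, and a basic scaling-and-squaring Taylor exponential written here for self-containment. All numerical values are arbitrary.

```python
import math

def generator(p, h, D, M):
    """Truncated constant-coefficient matrix A for indices m, n in {-M, ..., M}."""
    idx = list(range(-M, M + 1))
    n = len(idx)
    A = [[0j] * n for _ in range(n)]
    for i, m in enumerate(idx):
        A[i][i] = complex(-0.5 * D * m * m)      # diffusion in the phase variable
        if i > 0:
            A[i][i - 1] = -0.5j * p * h          # coupling from the translation term
        if i < n - 1:
            A[i][i + 1] = -0.5j * p * h
    return A

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(A, t, squarings=10, terms=20):
    """exp(tA) by scaling and squaring with a truncated Taylor series."""
    n = len(A)
    scaled = [[a * t / 2 ** squarings for a in row] for row in A]
    E = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in E]
    for k in range(1, terms):
        term = matmul(term, [[a / k for a in row] for row in scaled])
        E = [[E[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        E = matmul(E, E)
    return E

def max_diff(A, B):
    n = len(A)
    return max(abs(A[i][j] - B[i][j]) for i in range(n) for j in range(n))

D, M = 0.5, 3
# With p = 0 the coupling vanishes and exp(At) is diagonal with entries exp(-D m^2 t / 2).
E0 = expm(generator(0.0, 1.0, D, M), 2.0)
diag_error = max(abs(E0[i][i] - math.exp(-0.5 * D * m * m * 2.0))
                 for i, m in enumerate(range(-M, M + 1)))
# For p > 0, check the semigroup property exp(A (t1 + t2)) = exp(A t1) exp(A t2).
A = generator(1.5, 1.0, D, M)
semigroup_error = max_diff(expm(A, 1.5), matmul(expm(A, 1.0), expm(A, 0.5)))
```

In a real computation the truncation size would be increased until the recovered pdf stops changing, and one such exponential is needed for each sampled value of $p$.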

Once we get the solution to (3-3), we can then substitute it into the Fourier
inversion formula for the motion group in (2-1) to recover the pdf $f(g; t)$ of $z(t)$.
The pdf $f(r, \theta; t)$ then follows by integrating with respect to $\phi$:

\[
f(r, \theta; t) = \frac{1}{2\pi} \int_0^{2\pi} f(g; t)\,d\phi
= \sum_{n \in \mathbf{Z}} j^{-n} e^{-jn\theta} \int_0^\infty \hat{f}_{0,n}(p; t)\,J_{-n}(pr)\,p\,dp. \tag{3-4}
\]



Integrating equation (3-4) over $\theta$ will give us the marginal pdf of $|z(t)|$ as:

\[
f(r; t) = \frac{1}{2\pi} \int_0^{2\pi} f(r, \theta; t)\,d\theta
= \int_0^\infty \hat{f}_{0,0}(p; t)\,J_0(pr)\,p\,dp. \tag{3-5}
\]

Using our method, we can get a simple and compact expression for the marginal
pdf for the output of the bandpass filter given in (3-5).

For details and numerical results generated using this approach, see [43]. 



4. Robotics 

A robotic manipulator arm is a device used to position and orient objects in 
space. The set of all reachable positions and orientations is called the workspace 
of the arm. A robot arm that can attain only a finite number of different states 
is called a discretely-actuated manipulator. For such manipulators, it is a com- 
binatorially explosive problem to enumerate by brute force all possible states for 
arms that have a high degree of articulation. The function that describes the 
relative density of reachable positions and orientations in the workspace (called 
a workspace density function) has been shown to be an important quantity in 
planning the motions of these manipulator arms [4]. This function is denoted as
$f(g; L)$ where $g \in SE(N)$, and $L$ is the length of the arm.

Noncommutative harmonic analysis enters in this problem as a way to reduce
this complexity. It was shown in [4] that the workspace density function
$f(g; L_1 + L_2)$ for two concatenated manipulator segments with lengths $L_1$ and $L_2$ is the
motion-group convolution

\[
f(g; L_1 + L_2) = f(g; L_1) * f(g; L_2) = \int_G f(h; L_1)\,f(h^{-1} \circ g; L_2)\,dh, \tag{4-1}
\]

where $h$ is a dummy variable of integration and $dh$ is the bi-invariant (Haar)
measure for $SE(N)$. That is, given two short arms with known workspace densities,
we can generate the workspace density of the long arm generated by stacking one
short arm on the other using equation (4-1). In order to perform these
convolutions efficiently, the concept of FFTs for the motion groups was studied in [6].
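A discrete analogue makes the relation between enumeration and convolution concrete. The Python sketch below is a toy model under loudly stated assumptions: $SE(2)$ is replaced by a finite stand-in (translations on an $N \times N$ grid taken modulo $N$, with rotations by multiples of 90 degrees), and each module has four hypothetical states. It shows that the histogram of end frames obtained by brute-force enumeration of a two-module arm equals the group convolution of the single-module density with itself.

```python
from collections import Counter
from itertools import product

N = 7   # grid size; translations are taken modulo N so the group is finite

def rotate(k, x, y):
    """Rotate (x, y) by k quarter turns, reduced modulo N."""
    for _ in range(k % 4):
        x, y = -y, x
    return x % N, y % N

def compose(g, h):
    """Group law (a, A) o (r, R) = (a + A r, A R) on the finite motion group."""
    (x1, y1, k1), (x2, y2, k2) = g, h
    rx, ry = rotate(k1, x2, y2)
    return ((x1 + rx) % N, (y1 + ry) % N, (k1 + k2) % 4)

def inverse(g):
    """Inverse (a, A)^{-1} = (-A^T a, A^T)."""
    x, y, k = g
    rx, ry = rotate(-k, -x, -y)
    return (rx, ry, (-k) % 4)

def convolve(f1, f2):
    """(f1 * f2)(g) = sum_h f1(h) f2(h^{-1} o g), with densities stored as sparse dicts."""
    out = Counter()
    for h, w1 in f1.items():
        for q, w2 in f2.items():          # q = h^{-1} o g, so g = h o q
            out[compose(h, q)] += w1 * w2
    return out

# Hypothetical module: four reachable end frames (x, y, quarter turns).
module_states = [(1, 0, 0), (1, 0, 1), (1, 0, 3), (2, 0, 0)]
f_module = Counter(module_states)

brute_force = Counter(compose(g1, g2)
                      for g1, g2 in product(module_states, repeat=2))
by_convolution = convolve(f_module, f_module)
identity_check = compose(module_states[1], inverse(module_states[1]))
```

For an arm with many modules the enumeration grows exponentially, while repeated convolution (or its Fourier-domain counterpart) scales with the size of the discretized group, which is the motivation stated above.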

In the rest of this section, we discuss an alternative method for generating 
manipulator workspace density functions that does not explicitly compute con- 
volutions. Instead, it relies on the same kinds of degenerate diffusions we have 
seen already in the context of phase noise. 

4.1. Inspiration of the algorithm. Consider a discretely-actuated serial 
manipulator which consists of concatenated segments called modules. Suppose 
that each module can reach 16 different states. The workspace of this
manipulator with 2 modules, 3 modules and 4 modules can be generated by brute force
enumeration because $16^2$, $16^3$, and $16^4$ are not terribly huge numbers. It is easy
to imagine that the workspace will spread out as modules are added.
This enlargement of the workspace is just like the diffusion produced
by a drop of ink spreading in a cup of water. Inspired by this observation, we 
view the workspace of a manipulator as something that grows/evolves from a 
single point source at the base as the length of the manipulator increases from 
zero. The workspace is generated after the manipulator grows to full length. 

4.2. Implementation of the algorithm. With this analogy, we then need to 
determine what kind of diffusion equation is suitable to model this process. We 
get such an equation by realizing that some characteristics of manipulators are 
similar to those of polymer chains like DNA. 

During our study of conformational statistics in polymer science, we derived a 
diffusion-type equation defined on the motion group [7]. This equation describes 
the probability density function of the position and orientation of the distal 
end of a stiff macromolecule chain relative to its proximal end. By involving 
parameters which indicate the kinematic properties of a manipulator into this 
equation, we can modify it to the diffusion-type equation describing the evolution 
of the workspace density function. It is written explicitly as 

+ (3 (Xf ) 2 + X* + £(X 3 *) 2 ) /. (4-2) 

Here $f$ stands for the workspace density function, and $L$ is the manipulator
length. The differential operators $X_1^R$ and $X_3^R$ are those defined on $SE(2)$ given
earlier. Parameters $\beta$, $\varepsilon$ and $\alpha$ describe the kinematic properties of manipulators.
We define these kinematic properties as flexibility, extensibility and the degree
of asymmetry. The parameter $\beta$ describes the flexibility of a manipulator in the
sense of how much a segment of the manipulator can bend per unit length. A
larger value of $\beta$ means that the manipulator can bend a lot. The parameter $\varepsilon$
describes the extensibility of a manipulator in the sense of how much a
manipulator can extend along its backbone direction. A larger value of $\varepsilon$ means that
the manipulator can extend a lot. The parameter $\alpha$ describes the asymmetry in
how the manipulator bends. When $\alpha = 0$, the manipulator can reach left and
right with equal ease. When $\alpha < 0$, there is a preference for bending to the left,
and when $\alpha > 0$ there is a preference for bending to the right. Since $\alpha$, $\beta$, and
$\varepsilon$ are qualitative descriptions of the kinematic properties of a manipulator, they
are not directly measurable.

This simple three-parameter model qualitatively captures the behavior that
has been observed in numerical simulations of workspace densities of
discretely-actuated variable-geometry truss manipulators [23]. Equation (4-2) can
be solved in the same way as the phase-noise equation; we have done this in [43].

5. Statistical Mechanics of Macromolecules 

In this section, we show how certain quantities of interest in polymer physics 
can be generated numerically using Euclidean-group convolutions. We also show 
how for wormlike polymer chains, a partial differential equation governs a pro- 
cess that evolves on the motion group and describes the diffusion of end-to-end 
position and orientation. This equation can be solved using the SE(3)-Fourier 
transform in a manner very similar to the way the phase-noise Fokker-Planck
equation was addressed in Section 3. This builds on classical works in polymer
theory such as [8; 15; 20; 22; 34; 44].

5.1. Mass density, frame density, and Euclidean group convolutions. 

In statistical mechanical theories of polymer physics, it is essential to compute 
ensemble properties of polymer chains averaged over all of their possible confor- 
mations [9; 27]. Noncommutative harmonic analysis provides a tool for comput- 
ing probability densities used in these averages. 

In this subsection we review three statistical properties of macromolecular
ensembles. These are: (1) the ensemble mass density for the whole chain, $\rho(x)$,
which is generated by imagining that one end of the chain is held fixed and a
cloud is generated by all possible conformations of the chain superimposed on
each other; (2) the ensemble tip frame density $f(g)$ (where $g$ is the frame of
reference of the distal end of the chain relative to the fixed proximal end); (3)
the function $\mu(g, x)$, which is the ensemble mass density of all configurations
which grow from the identity frame fixed to one end of the chain and terminate
at the relative frame $g$ at the other end. Figures that describe these quantities
can be found in [3].

The functions $\rho$, $f$, and $\mu$ are related to each other. Given $\mu(g, x)$, the
ensemble mass density is calculated by adding the contribution of each $\mu$ for
each different end position and orientation:

$$\rho(x) = \int_G \mu(g, x)\, dg. \tag{5-1}$$

This integration is written as being over all motions of the end of the chain, but
only frames $g$ in the support of $\mu$ contribute to the integral. Here $G$ is shorthand
for SE(3) and $dg$ denotes the invariant integration measure for SE(3).

In an analogous way, it is not difficult to see that integrating the $x$-dependence
out of $\mu$ provides the total mass of configurations of the chain starting at frame
$e$ and terminating at frame $g$. Since each chain has mass $M$, this means that
the frame density $f(g)$ is related to $\mu(g, x)$ as:

$$f(g) = \frac{1}{M} \int_{\mathbb{R}^3} \mu(g, x)\, dx. \tag{5-2}$$

We note the total number of frames attained by one end of the chain relative
to the other is

$$F = \int_G f(g)\, dg.$$

It then follows that

$$\int_{\mathbb{R}^3} \rho(x)\, dx = F \cdot M.$$

If the functions $\rho(x)$ and $f(g)$ are known for the whole chain then a number
of important thermodynamic and mechanical properties of the polymer can be
determined [6].
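The relations (5-1), (5-2) and the identity for the total mass can be checked on a small discrete example. The following Python sketch is only an illustration under our own assumptions: it replaces the polymer by a hypothetical planar lattice chain whose steps turn by 0 or ±90 degrees, places one unit of mass at the end of each step, and replaces the integrals by sums. The names `enumerate_chains` and `ensemble_densities` are ours.

```python
from collections import Counter, defaultdict
from itertools import product

# Hypothetical toy model: a planar lattice chain of n_steps unit steps.
HEADINGS = [(1, 0), (0, 1), (-1, 0), (0, -1)]   # headings 0..3
TURNS = (0, 1, -1)                              # straight, left, right

def enumerate_chains(n_steps):
    """Yield (bead positions, end frame) for every conformation.

    The proximal end sits at the origin with heading 0 (the identity frame).
    One bead of unit mass is placed at the end of every step, so each chain
    has total mass M = n_steps.
    """
    for turns in product(TURNS, repeat=n_steps):
        x, y, heading = 0, 0, 0
        beads = []
        for t in turns:
            heading = (heading + t) % 4
            dx, dy = HEADINGS[heading]
            x, y = x + dx, y + dy
            beads.append((x, y))
        yield beads, (x, y, heading)

def ensemble_densities(n_steps):
    """Return mu(g, x), rho(x), f(g) and M for the toy ensemble."""
    mu = defaultdict(Counter)   # mu[g][x]: mass at x from chains ending at g
    for beads, g in enumerate_chains(n_steps):
        mu[g].update(beads)
    M = n_steps
    rho = Counter()             # discrete (5-1): rho(x) = sum_g mu(g, x)
    f = {}                      # discrete (5-2): f(g) = (1/M) sum_x mu(g, x)
    for g, mass_at in mu.items():
        rho.update(mass_at)
        f[g] = sum(mass_at.values()) // M
    return mu, rho, f, M

mu, rho, f, M = ensemble_densities(4)
F = sum(f.values())             # total number of frames (conformations)
print(F, M, sum(rho.values()))  # the last value equals F * M
```

In this toy model $f(g)$ simply counts the conformations that end at frame $g$, so $F$ is the total number of conformations and the total mass is $F \cdot M$.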

We can divide the chain into $P$ segments that are short enough to allow
brute-force enumeration of $\rho_i(x)$ and $f_i(g)$ for $i = 1, \ldots, P$, where $g$ is the
relative frame of reference of the distal end of the segment with respect to the
proximal one. For a homogeneous chain, such as polyethylene, these functions
are the same for each value of $i = 1, \ldots, P$.

In the general case of a heterogeneous chain, we can calculate the functions
$\rho_{i,i+1}(x)$, $f_{i,i+1}(g)$, and $\mu_{i,i+1}(g, x)$ for the concatenation of segments
$i$ and $i+1$ from those of segments $i$ and $i+1$ separately in the following way:

$$\rho_{i,i+1}(x) = F_{i+1}\, \rho_i(x) + \int_G f_i(h)\, \rho_{i+1}(h^{-1} \circ x)\, dh, \tag{5-3}$$

$$f_{i,i+1}(g) = (f_i * f_{i+1})(g) = \int_G f_i(h)\, f_{i+1}(h^{-1} \circ g)\, dh, \tag{5-4}$$

and

$$\mu_{i,i+1}(g, x) = \int_G \left( \mu_i(h, x)\, f_{i+1}(h^{-1} \circ g) + f_i(h)\, \mu_{i+1}(h^{-1} \circ g,\, h^{-1} \circ x) \right) dh. \tag{5-5}$$

In these expressions $h \in G = SE(3)$ is a dummy variable of integration.
The meaning of equation (5-3) is that the mass density of the ensemble of all
conformations of two concatenated chain segments results from two contributions.
The first is the mass density of all the conformations of the lower segment
(weighted by the number of different upper segments it can carry, which
is $F_{i+1} = \int_G f_{i+1}(g)\, dg$). The second contribution results from rotating and
translating the mass density of the ensemble of the upper segment, and adding the
contribution at each of these poses (positions and orientations). This contribution
is weighted by the number of frames that the distal end of the lower segment
can attain relative to its base. Mathematically, $L(h)\rho_{i+1}(x) = \rho_{i+1}(h^{-1} \circ x)$ is
a left-shift operation which geometrically has the significance of rigidly translating
and rotating the function $\rho_{i+1}(x)$ by the transformation $h$. The weight
$f_i(h)\, dh$ is the number of configurations of the $i$th segment terminating at frame
of reference $h$.

The meaning of equation (5-4) is that the distribution of frames of reference 
at the terminal end of the concatenation of segments i and i + 1 is the group- 
theoretical convolution of the frame densities of the terminal ends of each of the 
two segments relative to their respective bases. This equation holds for exactly
the same reason that equation (4-1) does in the context of robot arms.

Equation (5-5) says that there are two contributions to $\mu_{i,i+1}(g, x)$. The first
comes from adding up all the contributions due to each $\mu_i(h, x)$. This is weighted
by the number of upper segment conformations with distal ends that reach the
frame $g$ given that their base is at frame $h$. The second comes from adding
up all shifted (translated and rotated) copies of $\mu_{i+1}(g, x)$, where the shifting is
performed by the lower distribution, and the sum is weighted by the number of
distinct configurations of the lower segment that terminate at $h$. This number
is $f_i(h)\, dh$.

Equations (5-3), (5-4) and (5-5) can be iterated as described in [3; 6]. 
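To make the convolution (5-4) concrete, the following Python sketch evaluates it on a small discrete group and compares the result with direct enumeration. Everything here is an illustrative assumption rather than the method of [3; 6]: the continuous group SE(3) is replaced by lattice translations of the plane combined with quarter-turn rotations, integrals become sums, and a "segment" is a single unit step preceded by a turn of 0 or ±90 degrees. The function names are ours.

```python
from collections import Counter
from itertools import product

# Discrete stand-in for a motion group: g = (x, y, r) is a lattice
# translation (x, y) combined with a rotation by r quarter turns.

def rotate(r, x, y):
    """Rotate the vector (x, y) by r quarter turns counterclockwise."""
    for _ in range(r % 4):
        x, y = -y, x
    return x, y

def compose(g, h):
    """Group product g o h: apply h in the frame reached by g."""
    hx, hy = rotate(g[2], h[0], h[1])
    return (g[0] + hx, g[1] + hy, (g[2] + h[2]) % 4)

def inverse(g):
    """Group inverse, so that compose(g, inverse(g)) is the identity."""
    x, y = rotate(-g[2], -g[0], -g[1])
    return (x, y, (-g[2]) % 4)

def convolve(f1, f2):
    """(f1 * f2)(g) = sum_h f1(h) f2(h^{-1} o g), the discrete form of (5-4)."""
    support = {compose(h, k) for h in f1 for k in f2}
    out = Counter()
    for g in support:
        for h, weight in f1.items():
            out[g] += weight * f2.get(compose(inverse(h), g), 0)
    return out

def segment_density():
    """Frame density of one toy segment: turn, then step along the heading."""
    f = Counter()
    for turn in (0, 1, 3):
        x, y = rotate(turn, 1, 0)
        f[(x, y, turn)] += 1
    return f

def brute_force(n_segments):
    """Frame density of n concatenated segments by direct enumeration."""
    f = Counter()
    frames = list(segment_density())
    for chain in product(frames, repeat=n_segments):
        g = (0, 0, 0)
        for step in chain:
            g = compose(g, step)
        f[g] += 1
    return f

f1 = segment_density()
f12 = convolve(f1, f1)
print(f12 == brute_force(2))   # convolution reproduces the enumeration
```

Iterating `convolve` builds the frame density of longer chains from those of shorter ones, which is the role played by (5-4) in the text.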

5.2. Statistics of stiff molecules as solutions to PDEs on SO(3) and 
SE(3). Experimental measurements of the stiffness constants of DNA and other 
stiff (or semi-flexible) macromolecules have been reported in a number of papers, 
as well as the statistical mechanics of such molecules. See [17; 26; 29; 30; 31; 32; 
33; 38], for example. 

The stiffness and chirality (how helical the molecule is) can be described with
parameters $D_{lk}$ and $d_l$ for $l, k = 1, 2, 3$. In particular, $D_{lk}$ are the elements
of the inverse of the stiffness matrix. When a force is applied, these constants
determine how easily one end of the molecule deflects from the helical shape that
it assumes when no forces act on it. The parameters $d_l$ describe the helical shape
of an undeformed molecule with flexibility described by $D_{lk}$. These parameters
are described in detail in [7].

Degenerate diffusion equations describing the evolution of position and
orientation of frames of reference attached to points on the chain at different values
of length, $L$, have been derived [6; 43]. These equations incorporate stiffness and
chirality information and are written in terms of SE(3) differential operators as

$$\left( \frac{\partial}{\partial L} - \frac{1}{2} \sum_{k,l=1}^{3} D_{lk} X_l^R X_k^R - \sum_{l=1}^{3} d_l X_l^R + X_6^R \right) f = 0. \tag{5-6}$$

The initial conditions are $f(a, A; 0) = \delta(a)\,\delta(A)$, where $g = (a, A)$.

This equation has been solved using the operational properties of the SE(3) 
Fourier transform in [5; 6; 43].

6. Conclusions

This paper has reviewed a number of applications of harmonic analysis on 
the motion groups. This illustrates the power of noncommutative harmonic 
analysis, and its potential as a computational and analytical tool for solving 
real-world problems. We hope that this review will stimulate interest among 
others working in the field of noncommutative harmonic analysis to apply these 
methods to problems in engineering, and we hope that those in the engineering 
sciences will appreciate noncommutative harmonic analysis for the powerful tool 
that it is.

Three-Dimensional Data

Abstract. Three-dimensional volumetric data are becoming increasingly
available in a wide range of scientific and technical disciplines. With the 
right tools, we can expect such data to yield valuable insights about many 
important phenomena in our three-dimensional world. 

In this paper, we develop tools for the analysis of 3-D data which may 
contain structures built from lines, line segments, and filaments. These 
tools come in two main forms: (a) Monoscale: the X-ray transform, offering 
the collection of line integrals along a wide range of lines running through 
the image — at all different orientations and positions; and (b) Multiscale: 
the (3-D) beamlet transform, offering the collection of line integrals along 
line segments which, in addition to ranging through a wide collection of 
locations and positions, also occupy a wide range of scales. 

We describe different strategies for computing these transforms and sev- 
eral basic applications, for example in finding faint structures buried in 
noisy data. 



1. Introduction 

In field after field, we are currently seeing new initiatives aimed at gathering 
large high-resolution three-dimensional datasets. While three-dimensional data 
have always been crucial to understanding the physical world we live in, this 
transition to ubiquitous 3-D data gathering seems novel. The driving force is 
undoubtedly the pervasive influence of increasing storage capacity and computer 
processing power, which affects our ability to create new 3-D measurement in- 
struments, but which also makes it possible to analyze the massive volumes of 
data that inevitably result when 3-D data are being gathered. 



Keywords: 3-D volumetric (raster-scan) data, 3-D x-ray transform, 3-D beamlet transform, 
line segment extraction, curve extraction, object extraction, linogram, slant stack, shearing, 
planogram.

As examples of such ongoing developments we can mention: Extragalactic
Astronomy [50], where large-scale galaxy catalogs are being developed; Biological
Imaging, where methods like single-particle electron microscopy and tomographic
electron microscopy directly give 3-D data about structures of biological interest
at the cellular level and below [45; 26]; and Experimental Particle Physics, where
3-D detectors lead to new types of experiments and new data analysis questions
[22].

In this paper we describe tools which will be helpful for analyzing 3-D data 
when the features of interest are concentrated on lines, line segments, curves, 
and filaments. Such features can be contrasted to datasets where the objects 
of interest might be blobs or pointlike objects, or where the objects of interest 
might be sheets or planar objects. Effectively, we are classifying objects by 
their dimensionality; and for this paper the underlying objects of interest are of
dimension 1 in $\mathbb{R}^3$.




Figure 1. A simulated large-scale galaxy distribution. (Courtesy of Anatoly
Klypin.)



1.1. Background motivation. As an example where such concerns arise, 
consider an exciting current development in extragalactic astronomy: the com- 
pilation and publication of the Sloan Digital Sky Survey, a catalog of galaxies 
which spans an order of magnitude greater scale than previous catalogs and 
which contains an order of magnitude more data. 

The catalog is thought to be massive enough and detailed enough to shed 
considerable new light on the processes underlying the formation of matter and 
galaxies. It will be particularly interesting (for us) to better understand the 
filamentary and sheetlike structure in the large-scale galaxy distribution. This 
structure reflects gravitational processes which cause the matter in the universe 
to collapse from an initially fully three-dimensional scatter into a scatter
concentrated on lower-dimensional structures [41; 25; 49; 48].

Figure 1 illustrates a point cloud dataset obtained from a simulation of galaxy
formation. Even cursory visual inspection suggests the presence of filaments and 
perhaps sheets in the distribution of matter. Of course, this is artificial data. 
Similar figures can be prepared for real datasets such as the Las Campanas cat- 
alog, and, in the future, the Sloan Digital Sky Survey. To the eye, the simulated 
and real datasets will look similar. But can one say more? Can one rigorously 
compare the quantitative properties of real and simulated data? Existing tech- 
niques, based on two-point correlation functions, seem to provide only very weak 
ability to discriminate between various point configurations [41; 25]. 

This is a challenging problem, and we expect that it can be attacked using 
the methods suggested in this paper. These methods should be able to quantify 
the extent and nature of filamentary structure in such datasets, and to provide 
invariants to allow detailed comparisons of point clouds. While we do not have 
space to develop such a specific application in detail in this paper, we hope to 
briefly convey here to the reader a sense of the relevance of our methods. 

What we will develop in this paper is a set of tools for digital 3-D data
which implement the X-ray transform and related transforms. For analysis of
continuum functions $f(x, y, z)$ with $(x, y, z) \in \mathbb{R}^3$, the X-ray transform takes
the form

$$(Xf)(L) = \int_L f(p)\, dp,$$

where $L$ is a line in $\mathbb{R}^3$, and $p$ is a variable indexing points in the line; hence the
mapping $f \mapsto Xf$ contains within it all line integrals of $f$.
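As a numerical sanity check on this definition, the following Python sketch approximates a single line integral by the midpoint rule and compares it with a closed form. It is not one of the digital transforms developed in this paper; the function names, the truncation length and the step count are our own illustrative choices. For the Gaussian $f(p) = e^{-|p|^2}$, the integral along a line at distance $d$ from the origin is $\sqrt{\pi}\, e^{-d^2}$.

```python
import math

def xray_transform(f, point, direction, half_length=8.0, steps=4000):
    """Approximate (Xf)(L), the integral of f along the line L.

    L passes through `point` with direction `direction`. The integral is
    truncated to |t| <= half_length, which assumes that f decays quickly,
    and evaluated with the midpoint rule in the arc-length parameter t.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    h = 2.0 * half_length / steps
    total = 0.0
    for i in range(steps):
        t = -half_length + (i + 0.5) * h
        p = [point[k] + t * unit[k] for k in range(3)]
        total += f(*p)
    return total * h

def gaussian(x, y, z):
    return math.exp(-(x * x + y * y + z * z))

# A line at distance 0.5 from the origin, running along the x-axis.
approx = xray_transform(gaussian, (0.0, 0.5, 0.0), (1.0, 0.0, 0.0))
exact = math.sqrt(math.pi) * math.exp(-0.25)
print(abs(approx - exact) < 1e-6)
```

The value depends only on the line, not on the point or the scaling of the direction vector used to describe it.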

It seems intuitively clear that the X-ray transform and related tools should be 
relevant to the analysis of data containing filamentary structure. For example, it 
seems that in integrating along any line which matches a filament closely over a 
long segment, we will get an unusually large coefficient, while on lines that miss 
filaments we will get small coefficients, and so the spread of coefficients across 
lines may reflect the presence of filaments. 
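The intuition in the preceding paragraph can be illustrated with a small simulation. The Python sketch below is a toy example under our own assumptions, not one of the algorithms of this paper: the cube size, noise level, filament amplitude and function names are arbitrary, and only lines parallel to the x-axis are examined. A filament whose individual voxels barely stand out from the noise produces by far the largest line sum.

```python
import random

def noisy_cube_with_filament(n, y0, z0, amplitude, seed=0):
    """n x n x n cube of unit-variance Gaussian noise plus a straight
    filament of the given amplitude along the x-axis through (y0, z0)."""
    rng = random.Random(seed)
    cube = [[[rng.gauss(0.0, 1.0) for _ in range(n)]
             for _ in range(n)] for _ in range(n)]
    for x in range(n):
        cube[x][y0][z0] += amplitude
    return cube

def x_line_sums(cube):
    """Discrete line integrals along every line parallel to the x-axis."""
    n = len(cube)
    return {(y, z): sum(cube[x][y][z] for x in range(n))
            for y in range(n) for z in range(n)}

cube = noisy_cube_with_filament(n=16, y0=5, z0=11, amplitude=3.0)
sums = x_line_sums(cube)
best = max(sums, key=sums.get)   # line with the largest coefficient
print(best)
```

Summing over the 16 voxels of the filament raises the signal by a factor of 16 while the noise standard deviation grows only by a factor of 4, which is why the matching line dominates.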

This sort of intuitive thinking resembles what on a more formal level would be 
called the principle of matched filtering in signal detection theory. That principle 
says that to detect a signal in noisy data, when the signal is at unknown location 
but has a known signature template, we should integrate the noisy data against 
the signature template shifted to all locations where the signal may be residing. 
Now filaments intuitively resemble lines, so integration along lines is a kind of 
intuitive matched filtering for filaments. Once this is said, it becomes clear that 
one wants more than just integrating along lines, because filamentarity can be 
a relatively local property, while lines are global objects. As filaments might 
resemble lines only over moderate-length line segments, one might find it more 
informative to compare them with templates of line integrals over line segments 
at all lengths, locations, and orientations. Such segments may do a better job of 
matching templates built from fragments of the filament.

Hence, in addition to the X-ray transform, we also consider in this paper
a multiscale digital X-ray transform which we call the beamlet transform. As
defined here, the beamlet transform is designed for data in a digital $n \times n \times n$
array. Its intent is to offer multiscale, multiorientation line integration.

1.2. Connection to 2-D beamlets. Our point of view is an adaptation to the 
3-D setting of the viewpoint of Donoho and Huo, who in [21] have considered 
beamlet analysis of 2-D images. They have shown that beamlets are connected 
with various image processing problems ranging from curve detection to image 
segmentation. In their classification, there are several levels to 2-D beamlet 
analysis: 

• Beamlet dictionary: a special collection of line segments, deployed across ori- 
entations, locations, and scales in 2-D, to sample these in an efficient and 
complete manner. 

• Beamlet transform: the result of obtaining line integrals of the image along 
all the beamlets. 

• Beamlet graph: a graph structure underlying the 2-D beamlet dictionary which 
expresses notions of adjacency of beamlets. Network flow algorithms can use 
this graph to explore the space of curves in images very efficiently. Multiscale 
chains of 2-D beamlets can be expressed naturally as connected paths in the 
beamlet graph. 

• Beamlet algorithms: algorithms for image processing which exploit the beam- 
let transform and perhaps also the beamlet graph. 

They have built a wide collection of tools to operationalize this type of analysis 
for 2-D images. These are available over the internet [1; 2]. In the BeamLab 
environment, one can, for example, assemble the various components in the 
above picture to extract filaments from noisy data. This involves calculating 
beamlet transforms of the noisy data, using the resulting coefficient pyramid as 
input to processing algorithms which are organized around the beamlet graph 
and which use various graph-theoretical optimization procedures to find paths 
in the beamlet graph which optimize a statistical goodness-of-match criterion. 

Exactly the same classification can be made in three dimensions, and very 
similar libraries of tools and algorithms can be built. Finally, many of the same 
applications from the two-dimensional case are relevant in 3-D. Our goal in this 
paper is to build the very basic components of this picture: describing the X-ray 
and beamlet transforms that we work with, the resulting beamlet pyramids, and 
a few resulting beamlet algorithms that are easy to implement in this framework. 
Unfortunately, in this paper we are unable to explore all the analogous
beamlet-based algorithms, such as the algorithms for extracting filaments from noisy
data using shortest-path and related algorithms in the beamlet graph. We simply
scratch the surface.

1.3. Contents. The contents of the paper are as follows:

• Section 2 offers a discussion of two different systems of lines in 3-D, one 
system enumerating all line segments connecting pairs of voxel corners on the 
faces of the digital cube, and one system enumerating all possible slopes and 
intercepts. 

• Section 3 discusses the construction of beamlets as a multiscale system based
on these systems of lines, and some properties of such systems. The two most
important properties are (a) the low cardinality of the system: it
has $O(n^4)$ elements as opposed to the $O(n^6)$ cardinality of the system of all
multiscale line segments; and (b) the possibility of expressing each line segment
in terms of a short chain of $O(\log(n))$ beamlets.

• Section 4 discusses two digital X-ray transform algorithms based on the vertex- 
pairs family of lines. 

• Section 5 discusses transform algorithms based on the slope-intercept family 
of lines. 

• Section 6 exhibits some performance comparisons.

• Section 7 offers some basic examples of X-ray analysis and synthesis. 

• Section 8 discusses directions for future work. 



2. Systems of Lines in 3-D 

To implement a digital X-ray transform one needs to define structured families 
of digital lines. We use two specific systems here, which we call the vertex- 
pair system and the slope-intercept system. Alternative viewpoints on ‘digital 
geometry’ and ‘discrete lines’ are described in [33; 34]. 

2.1. Vertex-pair systems. Take an $n \times n \times n$ cube of unit-volume voxels, and
let the vertex set $V$ consist of the voxel corners which are not interior to the cube.
These vertices occur on the faces of the data cube, and there are about $6(n+1)^2$
such vertices. For an illustration, see Figure 2.

To keep track of vertices, we label them by the face they belong to, $1 \le f \le 6$,
and by the coordinates $[k_1, k_2]$ within the face.

Now consider the collection of all line segments generated by taking distinct 
pairs of vertices in V. This includes many ‘global scale lines’ crossing the cube 
from one face to another, at voxel-level resolution. In particular it does not 
contain any line segments with endpoints strictly inside the cube. 

The set has roughly $18n^4$ elements, which can be usefully indexed by the pair
of faces $(f_1, f_2)$ they connect and the coordinates $[k_1^1, k_2^1]$, $[k_1^2, k_2^2]$ of the
endpoints on those faces. There are 15 such face-pairs involving distinct faces, and
we can uniquely specify a line by picking any such face-pair and any pair of
coordinate pairs obeying $k_i^j \in \{0, 1, 2, \ldots, n\}$.

Figure 2. The vertices associated with the data cube are the voxel corners on
the surface; a digital line indicated in red, with endpoints at vertices indicated in
green.
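The counts quoted for the vertex-pair system are easy to check by enumeration. The following Python sketch is our own illustration (the function names are hypothetical); it lists the voxel corners on the faces of the cube and counts all distinct vertex pairs, including pairs that lie on a common face.

```python
from itertools import product

def surface_vertices(n):
    """Voxel corners of an n x n x n cube that lie on its faces."""
    return [v for v in product(range(n + 1), repeat=3)
            if any(c == 0 or c == n for c in v)]

def vertex_pair_count(n):
    """Number of line segments joining two distinct surface vertices."""
    v = len(surface_vertices(n))
    return v * (v - 1) // 2

n = 4
print(len(surface_vertices(n)))   # 98, which is 6 n^2 + 2
print(vertex_pair_count(n))       # 4753, to be compared with 18 n^4 = 4608
```

The exact vertex count is $(n+1)^3 - (n-1)^3 = 6n^2 + 2$, slightly below $6(n+1)^2$ because corners on edges are shared between faces; the pair count approaches $18n^4$ as $n$ grows.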

2.2. Slope-intercept systems. We now consider a different family of lines,
defined not by the endpoints, but by a parametrization. For this family, it is best
to change the origin of the coordinate system so that the data cube becomes an
$n \times n \times n$ collection of cubes with center of mass at $(0, 0, 0)$. Hence, for $(x, y, z)$
in the data cube we have $|x|, |y|, |z| \le n/2$. We can consider three kinds of
lines: x-driven, y-driven, and z-driven, depending on which axis provides the
shallowest slopes. An x-driven line takes the form

$$z = s_z x + t_z, \qquad y = s_y x + t_y$$

with slopes $s_z$, $s_y$ and intercepts $t_z$ and $t_y$. Here the slopes satisfy
$|s_y|, |s_z| \le 1$. The y- and z-driven lines are defined with an interchange of
roles between x and y or z, as the case may be.

We will consider the family of lines generated by this, where the slopes and 
intercepts run through an equispaced family: 

$$s_x, s_y, s_z \in \{2\ell/n : \ell = -n/2, \ldots, n/2 - 1\}, \qquad t_x, t_y, t_z \in \{\ell : \ell = -n+1, \ldots, n-1\}.$$
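A short Python sketch may help fix ideas about this family. It rests on assumptions of ours that should be checked against the original formula: slopes of the form $2\ell/n$ with $\ell = -n/2, \ldots, n/2 - 1$, integer intercepts from $-n+1$ to $n-1$, even $n$, and sampling of each line at the voxel centers along the driving axis. The function names are hypothetical.

```python
def slope_intercept_grid(n):
    """Equispaced slopes and intercepts (index ranges are assumed)."""
    slopes = [2 * l / n for l in range(-n // 2, n // 2)]
    intercepts = list(range(-n + 1, n))
    return slopes, intercepts

def x_driven_line(n, s_y, s_z, t_y, t_z):
    """Points (x, y, z) of an x-driven line, sampled at the voxel centers
    along x and kept only where |y|, |z| <= n/2 (inside the centered cube)."""
    points = []
    for i in range(n):
        x = -n / 2 + 0.5 + i
        y = s_y * x + t_y
        z = s_z * x + t_z
        if abs(y) <= n / 2 and abs(z) <= n / 2:
            points.append((x, y, z))
    return points

def family_size(n):
    """Number of (slope, intercept) combinations over the three driving axes."""
    slopes, intercepts = slope_intercept_grid(n)
    per_axis = (len(slopes) * len(intercepts)) ** 2
    return 3 * per_axis

print(family_size(4))                      # 3 * (4 * 7)^2 = 2352
print(len(x_driven_line(4, 0.5, 0.0, 0, 1)))
```

Under these assumptions the family has $3n^2(2n-1)^2$ members, which is of order $n^4$, the same order as the vertex-pair system.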

3. Multiscale Systems: Beamlets 

The systems of line segments we have just defined consist of global scale seg- 
ments beginning and ending on faces of the cube. For analysis of fragments of 
lines and curves, it is useful to have access to line segments which begin and 
end well inside the cube and whose length is adjustable so that there are line 
segments of all lengths between voxel scale and global scale. 

A seemingly natural candidate for such a collection is the family of all line 
segments between any voxel corner and any other voxel corner. For later use, we 
call such segments 3-D beams. This set is expressive — it approximates any line 
segment we may be interested in to within less than the diameter of one voxel.
On the other hand, the set of all such beams can be of huge cardinality: with
$O(n^3)$ choices for both endpoints, we get $O(n^6)$ 3-D beams, so that it is clearly
infeasible to use the collection of 3-D beams as a basic data structure even for
$n = 64$. Note that digital 3-D imagery is becoming available with $n = 2048$ from
Resolution Sciences, Inc., Corte Madera, CA, and many important applications
involve the analysis of volumetric images that contain filamentary objects such
as blood vessel networks or fibers in paper. For such datasets it seems natural
to use beam-based analysis tools; however, working with $O(n^6)$ storage would
be prohibitive.

The challenge, then, is to develop a reduced-cardinality substitute for the
collection of 3-D beams, but one which is nevertheless expressive, in that it can 
be used for many of the same purposes as 3-D beams. Throughout this section 
we will be working in the context of vertex-pair systems of lines. 

3.1. The beamlet system. A dyadic interval $D(j, k)$ satisfies
$D(j, k) = [k/2^j, (k+1)/2^j] \subset [0, 1]$, where $k$ is an integer with $0 \le k < 2^j$;
it has length $2^{-j}$. A dyadic cube $C(k_1, k_2, k_3, j) \subset [0, 1]^3$ is the direct
product of dyadic intervals

$$[k_1/2^j, (k_1+1)/2^j] \otimes [k_2/2^j, (k_2+1)/2^j] \otimes [k_3/2^j, (k_3+1)/2^j]$$

where $0 \le k_1, k_2, k_3 < 2^j$ for an integer $j \ge 0$. Such cubes can be viewed as
descended from the unit cube $C(0, 0, 0, 0) = [0, 1]^3$ by recursive partitioning. Hence,
splitting $C(0, 0, 0, 0)$ in half along each axis yields the eight cubes
$C(k_1, k_2, k_3, 1) = D(1, k_1) \otimes D(1, k_2) \otimes D(1, k_3)$ where $k_i \in \{0, 1\}$;
splitting those in half along each axis we get the 64 subcubes $C(k_1, k_2, k_3, 2)$
where $k_i \in \{0, 1, 2, 3\}$; and if we decompose the unit cube into $n^3$ voxels using
a uniform n-by-n-by-n grid with $n = 2^J$ dyadic, then the individual voxels are the
$n^3$ cells $C(k_1, k_2, k_3, J)$, $0 \le k_1, k_2, k_3 < n$.
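The recursive partitioning just described is straightforward to express in code. The following Python sketch is an illustration of ours (the function names are hypothetical); it represents a dyadic cube by its index tuple and uses exact fractions for the interval endpoints.

```python
from fractions import Fraction
from itertools import product

def dyadic_interval(j, k):
    """D(j, k) = [k / 2^j, (k + 1) / 2^j] with exact endpoints."""
    assert 0 <= k < 2 ** j
    return Fraction(k, 2 ** j), Fraction(k + 1, 2 ** j)

def dyadic_cube(k1, k2, k3, j):
    """C(k1, k2, k3, j) as a triple of dyadic intervals."""
    return tuple(dyadic_interval(j, k) for k in (k1, k2, k3))

def children(k1, k2, k3, j):
    """Indices of the eight subcubes obtained by halving along each axis."""
    return [(2 * k1 + a, 2 * k2 + b, 2 * k3 + c, j + 1)
            for a, b, c in product((0, 1), repeat=3)]

def volume(cube):
    v = Fraction(1)
    for lo, hi in cube:
        v *= hi - lo
    return v

root = (0, 0, 0, 0)
scale1 = children(*root)
scale2 = [c for q in scale1 for c in children(*q)]
print(len(scale1), len(scale2))                    # 8 and 64
print(sum(volume(dyadic_cube(*q)) for q in scale2))  # the subcubes tile [0,1]^3
```

Repeating the subdivision $J$ times reaches the $n^3 = 2^{3J}$ voxels, which are the leaves of the resulting tree of cubes.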



Figure 3. Dyadic cubes.

Associated to each dyadic cube we can build a system of lines based on vertex
pairs. For a dyadic cube $Q = C(k_1, k_2, k_3, j)$ tiled by voxels of side $1/n$ for a
dyadic $n = 2^J$ with $J \ge j$, let $V_n(Q)$ be the set of voxel corners on the faces of
$Q$ and let $B_n(Q)$ be the collection of all line segments generated by vertex-pairs
from $V_n(Q)$.

Definition 1. We call $B_n(Q)$ the set of 3-D beamlets associated to the cube $Q$.
Taking the collection of all dyadic cubes at all dyadic scales $0 \le j \le J$, and all
beamlets generated by all these cubes, the 3-D beamlet dictionary is the union
of all the beamlet sets of all dyadic subcubes of the unit cube, and we denote
this set by $B_n$.




Figure 4. Vertices on dyadic cubes are always just the points on the faces of the 
cubes. 




Figure 5. Examples of beamlets at two different scales: (a) scale 0 (coarsest 
scale); (b) scale 1 (next finer scale). 



This dictionary of line segments has three desirable properties. 

• It is a multi-scale structure: it consists of line segments occupying a range of 
scales, locations, and orientations. 

• It has controlled cardinality: there are only $O(n^4)$ 3-D beamlets, as compared
to $O(n^6)$ beams.

• It is expressive: a small number of beamlets can be chained together to
approximately represent any beam.

The first property is obvious: the multi-scale, multi-orientation, multi-location
nature has been obtained as a direct result of the construction.

To show the second property, we compute the cardinality of $B_n$. By assumption,
our voxel size $1/n$ has $n = 2^J$, so there are $J + 1$ scales of dyadic cubes.
For any scale $0 \le j \le J$ there are $2^{3j}$ dyadic cubes of scale $j$; each of
these dyadic cubes contains $2^{3(J-j)}$ voxels, approximately $6 \times 2^{2(J-j)}$ boundary
vertices, and therefore about $18 \times 2^{4(J-j)}$ 3-D beamlets.

The total number of 3-D beamlets at scale $j$ is the number of dyadic cubes
at scale $j$ times the number of beamlets of a dyadic cube at scale $j$, which gives
$18 \times 2^{4J-j}$. Summing over all scales gives approximately $36 \times 2^{4J} = O(n^4)$
elements in total.
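The estimate can be compared with an exact count. The Python sketch below is our own check, under the assumption that beamlets are counted cube by cube (so a segment lying on a face shared by several cubes is counted once for each of them); the function name is hypothetical.

```python
def beamlet_count(J):
    """Sum over scales of (number of dyadic cubes) x (vertex pairs per cube)."""
    total = 0
    for j in range(J + 1):
        m = 2 ** (J - j)                   # voxels per side of a scale-j cube
        v = (m + 1) ** 3 - (m - 1) ** 3    # voxel corners on its faces
        total += 8 ** j * (v * (v - 1) // 2)
    return total

for J in (3, 5):
    n = 2 ** J
    print(n, beamlet_count(J), beamlet_count(J) / n ** 4)
```

For $n = 8$ the count is 147465, against $36n^4 = 147456$, and the ratio to $n^4$ stays close to 36 for larger $n$.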

We will now turn to our third claim — that the collection of 3-D beamlets is 
expressive. To develop our support for this claim, we will first introduce some 
additional terminology and make some simple observations, and then state and 
prove a formal result. 

3.2. Decompositions of beams into chains of beamlets. In decomposing 
a dyadic cube Q at scale j into its 8 disjoint dyadic subcubes at scale j + 1, we 
call those subcubes the children of Q , and say that Q is their parent. We also 
say that 2 dyadic cubes are siblings if they have the same parent. Terms such 
as descendants and ancestors have the obvious meanings. In this terminology, 
except at the coarsest and finest scales, all dyadic subcubes have 8 children, 7 
siblings and 1 parent. The data cube has neither parents nor siblings and the 
individual voxels don’t have children. We can view the inheritance structure of 
the set of dyadic cubes as a balanced tree where each node corresponds to a 
dyadic cube, the data cube corresponds to the root and the voxel cubes are the 
leaves. The depth of a node is simply the scale parameter j of the corresponding 
cube $C(k_1, k_2, k_3, j)$.

The dividing planes of a dyadic cube are the 3 planes that divide the cube
into its 8 children; we refer to them as the x-divider, y-divider and z-divider.
For example the x-divider of $C(0, 0, 0, 0)$ is the plane $\{(1/2, y, z) : 0 \le y, z \le 1\}$,
the y-divider is $\{(x, 1/2, z) : 0 \le x, z \le 1\}$, and the z-divider is
$\{(x, y, 1/2) : 0 \le x, y \le 1\}$.

We now make a remark about beamlets of data cubes at different dyadic $n$.
Suppose we have two data cubes of sizes $n_1 = 2^{j_1}$ and $n_2 = 2^{j_2}$, and suppose
that $n_2 > n_1$. Viewing the two data cubes as filling out the same volume $[0, 1]^3$,
consider the beamlets in each system associated with a common dyadic cube
$C(k_1, k_2, k_3, j)$, $0 \le j \le j_1 < j_2$. The collection of beamlets associated with
the $n_2$-based system has a finer resolution than those associated with the
$n_1$-based system; indeed every beamlet in $B_{n_1}$ also occurs in $B_{n_2}$. Hence,
in a natural sense, the beamlet families refine, and have a natural limit, $B_\infty$,
say. $B_\infty$, of course, is the collection of all line segments in $[0, 1]^3$ with both
endpoints on the boundary of some dyadic cube. We will call members of this
family the continuum beamlets, as opposed to the members of some $B_n$, which
are discrete beamlets. Every discrete beamlet is also a continuum beamlet, but
not the reverse.

Figure 6. Dividing planes of a cube.

Lemma 1. Divide a continuum beamlet associated to a dyadic cube Q into the 
components lying in each of the child subcubes. There are either one, two, three 
or four distinct components, and these are continuum beamlets. 

PROOF. Traverse the beamlet starting from one endpoint headed toward the
other. If you travel through more than one subcube along the way, then at any
crossing from one cube to another, you will have to penetrate one of the x-, y-,
or z-dividers. You can cross each such dividing plane at most once, and so there
can be at most 4 different subcubes traversed. □
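The counting argument in this proof can be traced in a few lines of code. The following Python sketch is an illustration of ours for the unit cube only (the function name is hypothetical): it finds where a segment meets the three dividing planes and cuts it there, so that each piece lies in a single child subcube.

```python
from fractions import Fraction

def split_at_dividers(p, q):
    """Split the segment from p to q (inside [0,1]^3) at the dividing
    planes x = 1/2, y = 1/2 and z = 1/2 of the unit cube."""
    half = Fraction(1, 2)
    cuts = set()
    for a, b in zip(p, q):
        if a != b:
            t = (half - a) / (b - a)   # parameter where this plane is met
            if 0 < t < 1:
                cuts.add(t)            # each plane contributes at most one cut
    ts = [Fraction(0)] + sorted(cuts) + [Fraction(1)]

    def point(t):
        return tuple(a + t * (b - a) for a, b in zip(p, q))

    return [(point(t0), point(t1)) for t0, t1 in zip(ts, ts[1:])]

origin = (Fraction(0), Fraction(0), Fraction(0))
pieces = split_at_dividers(origin, (Fraction(1), Fraction(4, 5), Fraction(3, 5)))
print(len(pieces))   # 4: the planes are crossed at t = 1/2, 5/8 and 5/6
```

Because a straight segment meets each of the three planes in at most one interior point, the list of cuts has at most three entries and the segment falls into at most four pieces.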

Theorem 1. Each line segment lying inside the unit cube can be approximated
by a connected chain of $m$ discrete beamlets in $B_n$ where the Hausdorff distance
from the chain to the beam is at most $1/n$ and where the number of links $m$ in
the chain is bounded above by $6 \log_2(n)$.

PROOF. Consider an arbitrary line segment $\ell$ inside the unit cube with
endpoints $v_1$ and $v_2$ that are not necessarily voxel corners. We can approximate $\ell$
with a beam $b$ by replacing each endpoint with the closest voxel corner. Since the
$\sqrt{3}/(2n)$ neighborhood of any point inside the unit cube must include a vertex,
the Hausdorff distance between $\ell$ and $b$ is bounded by $\sqrt{3}/(2n)$.

We now decompose the beam b into a minimal cardinality chain of connected 
continuum beamlets, by a recursive algorithm which starts with a line segment, 
and at each stage breaks it into a chain of continuum beamlets, with remainders 
on the ends, to which the process is recursively applied. 

Figure 7. Decomposition of several beamlets into continuum beamlets at next
finer scale, indicating cases which can occur.

In detail, this works as follows. If b is already a continuum beamlet for
C(0, 0, 0, 0) we are done; otherwise, b can be decomposed into a chain of (at most
four) segments based on crossings of b with the 3 dividing planes of C(0, 0, 0, 0).
The interior segments of this chain all have endpoints on the dividing planes and 
hence are all continuum beamlets for the cubes at scale j = 1. We go to work 
on the remaining segments. Either endmost segment of the chain might be a 
continuum beamlet for the associated dyadic cube at scale j = 1; if so, we are 
done with that segment; if not, we decompose the segment into its components 
lying in the children dyadic cubes at scale j = 2. Again, the internal segments of 
this chain will be continuum beamlets, and additionally, at least one of the two 
endmost segments will be a continuum beamlet. If both endmost segments are 
continuum beamlets, then we are done. If not, take the segment which is not a 
beamlet and break it into its crossings with the dividing planes of the enclosing 
dyadic cube. Continue in this way until we reach the finest level, where, by 
hypothesis, we obtain a segment which has an endpoint in common with the 
original beam b. Since b is a beam, it ends in a vertex corner, and since the 
segment arose from earlier stages of the algorithm, the other endpoint is on the 
boundary of a dyadic cube. Hence the segment is a continuum beamlet and we 
are done. 

Let’s upper-bound the number of beamlets generated by this algorithm. Assume 
always that we never fortuitously get an end segment to be a beamlet when 
it is not mandated by the above comments. So we have 2 continuum beamlets 
at the 1st scale and we are left with 2 segments to replace by 2 chains of discrete 
beamlets at finer scales. In the worst case, each of the segments, when decomposed 
at the next scale, generates 3 continuum beamlets and 1 non-beamlet. 
Continuing to the finest scale, in which the dyadic cubes are the individual voxels, 
we can have at most 2 beamlets in the chain at the finest scale. So in the 
worst case our chain will include 2 continuum beamlets at the 1st scale, 2 at the 
finest scale and 6 at any other scale 2, 3, ..., J − 1. So we get a maximum total 
of 2 + 6(J − 1) + 2 = 6J − 2 continuum beamlets needed to represent any line 
segment in the unit cube. 

We now take the multiscale chain of beamlets and approximate it by a chain 
of discrete beamlets. The point is that the Hausdorff distance between line 
segments is upper-bounded by the distance between corresponding endpoints. 
Now both endpoints of any continuum beamlet in $B_\infty$ lie on certain voxel faces. 
Hence they lie within a $1/(\sqrt{2}\,n)$ neighborhood of some voxel corner. Hence 
any continuum beamlet in $B_\infty$ can be approximated by a discrete beamlet in $B_n$ 
within a Hausdorff distance of $1/(\sqrt{2}\,n)$. Notice that there may be several 
choices of such approximants; we can make the choice of approximant consistently 
from one beamlet to the next to maintain chain connectivity if we like. 

So we get a maximum total of 6J − 2 connected beamlets needed to approximate 
any line segment in the unit cube to within a Hausdorff distance of 
$\max\{\sqrt{3}/(2n),\ 1/(\sqrt{2}\,n)\} < 1/n$. □ 

The fact that arbitrary line segments can be approximated by relatively few 
beamlets implies that every smooth curve can be approximated by relatively few 
beamlets. 

To see this, notice that a smooth curve can be approximated to within distance 
$1/m^2$ by a chain of about m line segments; this is a simple application of calculus. 
But then, approximating each line segment in the chain by its own chain of at most 
$6 \log_2(n)$ beamlets, we get approximation within distance $1/m^2 + 1/n$ by 
$O(m \log(n))$ beamlets. Moreover, we can set up the process so that the individual 
chains of beamlets form a single unbroken chain. Compare also [17, Lemma 2.2, 
Corollary 2.3, Lemma 3.2]. 

4. Vertex-Pairs Transform Algorithms 

Let $v = (k_1, k_2, k_3)$ be a voxel index, where $0 \le k_i < n$, and let I(v) be the 
corresponding voxel intensities of a 3D digital image. Let f(x) be the function 
on $\mathbb{R}^3$ that represents the data cube by piecewise constant interpolation, 
i.e. the value f(x) = I(v) when $x \in v$. 

Definition 2. For each line segment $b \in B_n$, let $\gamma_b(\cdot)$ correspond to the 
unit speed path traversing b. 

The discrete X-ray transform based on global-scale vertex-pairs lines is defined 
as follows. With $B_n([0,1]^3)$ denoting the collection of vertex-pairs line segments 
associated to the cube $[0,1]^3$, 

\[ X_I(b) = \int f(\gamma_b(\ell))\, d\ell, \qquad b \in B_n([0,1]^3). \]

The beamlet transform based on multiscale vertex-pairs lines is the collection of 
all multiscale line integrals 

\[ T_I(b) = \int f(\gamma_b(\ell))\, d\ell, \qquad b \in B_n. \]

4.1. Direct evaluation. There is an obvious algorithm for computing beamlet/ 
X-ray coefficients: one at a time, simply compute the sums underlying the defin- 
ing integrals. This algorithm steps systematically through the beamlet dictionary 
using the indexing method we described above, identifies the voxels on the path 
$\gamma_b$ for each beamlet, visits each voxel and forms a sum weighting the voxel value 
with the arc length of $\gamma_b$ in that voxel. 

In detail, the sum we are referring to works as follows. Let Q(v) denote the 
cube representing voxel v and $\gamma_b$ the curve traversing b. Then 

\[ T_I(b) = \sum_v I(v)\, \mathrm{Length}(\gamma_b \cap Q(v)). \]

Hence, defining weights $w_b(v) = \mathrm{Length}(\gamma_b \cap Q(v))$ as the arc lengths of the 
corresponding fragments, one simply needs the sum $\sum_v w_b(v) I(v)$. 

Of course, most voxels are not involved in this sum; one only wants to involve 
the voxels where $w_b > 0$. The straightforward way to do this, explicitly following 
the curve $\gamma_b$ from voxel to voxel and calculating the arc length of the fragment of 
curve within the voxel, is inelegant and bulky. A far better way to do this is to 
identify three equispaced sequences and then merge them. Those sequences are: 
(1) the intersections of $\gamma_b$ with the parallel planes $x = k_1/n$; (2) the intersections 
with the planes $y = k_2/n$; and (3) the intersections with the planes $z = k_3/n$. 
Each of these collections of intersections is equispaced and easy to calculate. It 
is also very easy to merge them in the order they would be encountered in a 
traverse of the beamlet in definite order. This merger produces the sequence of 
intersections that would be encountered if we pedantically tracked the progress 
of the beamlet voxel-by-voxel. The weights $w_b(v)$ are just the distances between 
successive points. 
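
The merging idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `beamlet_weights` and `line_integral` are hypothetical, the segment is given by two endpoints in the unit cube, and the image is a NumPy array indexed as `I[k1, k2, k3]`.

```python
import numpy as np

def beamlet_weights(p0, p1, n):
    """Arc length of the segment p0 -> p1 inside each voxel of an n^3 grid on [0,1]^3.

    Returns a dict {(k1, k2, k3): w_b(v)} for the voxels the segment passes through.
    """
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    d = p1 - p0
    length = float(np.linalg.norm(d))
    ts = [0.0, 1.0]                       # segment parameters of the two endpoints
    for axis in range(3):                 # one equispaced sequence per axis
        if d[axis] == 0.0:
            continue                      # the segment never crosses these planes
        lo, hi = sorted((p0[axis], p1[axis]))
        k = np.arange(np.ceil(lo * n), np.floor(hi * n) + 1)
        t = (k / n - p0[axis]) / d[axis]  # parameter where coordinate = k/n
        ts.extend(t[(t > 0.0) & (t < 1.0)].tolist())
    ts = np.unique(np.round(ts, 12))      # merge the three sequences in traversal order
    weights = {}
    for t0, t1 in zip(ts[:-1], ts[1:]):
        mid = p0 + 0.5 * (t0 + t1) * d    # the midpoint identifies the voxel
        v = tuple(int(c) for c in np.minimum(np.floor(mid * n), n - 1))
        weights[v] = weights.get(v, 0.0) + (t1 - t0) * length
    return weights

def line_integral(I, p0, p1):
    """Direct evaluation of sum_v w_b(v) I(v) for a single segment."""
    n = I.shape[0]
    return sum(w * I[v] for v, w in beamlet_weights(p0, p1, n).items())
```

For the main diagonal of a 4 x 4 x 4 array, the three sequences coincide and the merge leaves four fragments of equal length, one in each diagonal voxel.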

The complexity of this algorithm is rather stiff: on an n x n x n voxel array 
there are order $O(n^4)$ beamlets to follow, and most of the sums require $O(n)$ 
flops, so the whole algorithm requires $O(n^5)$ flops in general. Experimental 
studies will be described below. 

4.2. Two-scale recursion. There is an asymptotically much faster algorithm 
for 3-D X-ray and beamlet transforms, based on an idea which has been 
well-established in the two-dimensional case; see the articles by Brandt and Dym [12], 
by Götze and Druckenmiller [29], and by Brady [9], or the discussion in [21]. 

The basis for the algorithm is the divide and conquer principle. As depicted 
in Figure 7, and proven in Lemma 1, each 3-D continuum beamlet can be 
decomposed into 2, 3, or 4 continuum beamlets at the next finer scale: 

\[ b = \bigcup_i b_i. \qquad (4\text{-}1) \]

It follows that 

\[ \int f(\gamma_b(\ell))\, d\ell = \sum_i \int f(\gamma_{b_i}(\ell))\, d\ell. \]

This suggests that we build an algorithm on this principle, so that for $b \in B_n$ we 
identify several $b_i$ associated to the child dyadic cubes of b, getting the formula 

\[ T_I(b) = \sum_i T_I(b_i). \]

Hence, if we could compute all the beamlet coefficients at the finest scale, we 
could then use this principle to work systematically from fine scales to coarse 
scales, and produce all the beamlet coefficients as a result. 
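
To make the decomposition (4-1) concrete, the following Python sketch splits a segment at the three dividing planes of a cube and returns the resulting pieces. It is a minimal illustration only; the function name `split_at_dividers` and the representation of a segment as a pair of endpoints are assumptions made here, not notation from the text.

```python
import numpy as np

def split_at_dividers(p0, p1, lo=0.0, hi=1.0):
    """Split the segment p0 -> p1 at the planes x, y, z = (lo + hi) / 2.

    Each dividing plane is crossed at most once, so at most 4 pieces result.
    Returns a list of (start, end) endpoint pairs in traversal order.
    """
    p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
    d = p1 - p0
    mid = 0.5 * (lo + hi)
    ts = {0.0, 1.0}
    for axis in range(3):
        if d[axis] != 0.0:
            t = (mid - p0[axis]) / d[axis]   # parameter of the crossing, if any
            if 0.0 < t < 1.0:
                ts.add(round(float(t), 12))  # coincident crossings are merged
    ts = sorted(ts)
    return [(p0 + a * d, p0 + b * d) for a, b in zip(ts[:-1], ts[1:])]

def total_length(pieces):
    """Sum of the Euclidean lengths of the pieces."""
    return sum(float(np.linalg.norm(q - p)) for p, q in pieces)
```

Because the pieces tile the original segment, any line integral over the segment is the sum of the integrals over the pieces, which is the additivity the recursion relies on.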

The computational complexity of this fine-to-coarse strategy is obviously very 
favorable: it is bounded by $4\,|B_n|$ flops, since each coefficient’s computation 
requires at most 4 additions. So we get an $O(n^4)$ rather than $O(n^5)$ algorithm. 

There is a conceptual problem with implementing this principle, since in general, 
the decomposition of a discrete beamlet in $B_n$ into its fragments at the 
next finer scale (as we have seen) produces continuum beamlets, i.e. the $b_i$ are 
in general only in $B_\infty$, and not $B_n$. Hence it is not really the case that the terms 
$T_I(b_i)$ are available from finer scale computations. To deal with this, one uses 
approximation, identifying discrete beamlets $\tilde b_i$ which are ‘near’ the continuum 
beamlets $b_i$, and approximates the $T_I(b_i)$ by combinations of ‘nearby’ $T_I(\tilde b_i)$. 

Hence, in the end, we get favorable computational complexity for an approx- 
imately correct answer. We also get one very large advantage: instead of com- 
puting just a single X-ray transform, it computes all the scales of the multiscale 
beamlet transform in one pass. In other words: it costs the same to compute all 
scales or to compute just the coarsest scale. 

As we have described it, there are no parameters to ‘play with’ to control 
the accuracy, at perhaps greater computational expense. What to do if we want 
high accuracy? Staying within this framework, we can obtain higher precision 
by oversampling. We create an $N \times N \times N$ data cube, where $N = 2^e n$ and e 
is an oversampling parameter (e.g. e = 3), fill the values from the original data 
cube by interpolation (e.g. piecewise constant interpolation), run the two-scale 
algorithm for $B_N$, and then keep only the coefficients associated to $b \in B_N \cap B_n$. 
The complexity goes up as $2^{4e}$. 
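
The oversampling step with piecewise constant interpolation amounts to repeating each voxel $2^e$ times along every axis. A minimal NumPy sketch, in which the function name `oversample` is hypothetical:

```python
import numpy as np

def oversample(I, e):
    """Piecewise constant interpolation of an n^3 cube onto an N^3 cube, N = 2**e * n."""
    r = 2 ** e
    for axis in range(3):
        I = np.repeat(I, r, axis=axis)   # each voxel becomes an r x r x r block
    return I
```

With e = 3 the cube has $8^3 = 512$ times as many voxels, which is why the choice of e trades accuracy against memory as well as flops.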

5. Slope-Intercept Transform Algorithms 

We now develop two algorithms for X-ray transform based on the slope-angle 
family of lines described in Section 2.2. Both are decidedly more sophisticated 
than the vertex-pairs algorithms, which brings both benefits and costs. 

5.1. The slant stack/shearing algorithm. The first algorithm we describe 
adapts a fast algorithm for the X-ray transform in dimension 2, using this as 
an ‘engine’, and repeatedly applying it to obtain a fast algorithm for the X-ray 
transform in dimension 3. 

5.1.1. Slant Stack. The fast slant stack algorithm has been developed by 
Averbuch et al. (2001) [6] as a way to rapidly calculate all line integrals along lines in 
2-dimensional slope/angle form; i.e. either x-driven 2-dimensional lines of the 
form 

\[ y = sx + t, \qquad -n/2 \le x < n/2, \]

where s = k/n for $-n \le k < n$ and where $-n \le t < n$, or y-driven 2-dimensional 
lines of the form 

\[ x = sy + t, \qquad -n/2 \le y < n/2, \]

where s and t run through the same discrete ranges. The algorithm is approximate, 
because it does not exactly compute the voxel-level definition of X-ray 
coefficient assumed in Section 3 above (involving sums of voxel values times 
arc lengths). Instead, it computes exactly the appropriate sums deriving from 
so-called sinc-interpolation filters. For the set of x-driven lines we have 

\[ \mathrm{SlantStack}(y = sx + t, I) = \sum_{u=-n/2}^{n/2-1} \tilde I(u, su + t), \]

where I is a 2D discrete array and $\tilde I$ is its 2D sinc interpolant. The transform 
for the y-driven lines is defined in a similar fashion with the roles of x and y 
interchanged. The algorithm can obtain approximate line integrals along all lines 
of these two forms in $O(n^2 \log(n))$ flops, which is excellent considering that the 
number of pixels is $O(n^2)$. It is achieved by using a discrete Projection-Slice 
theorem that relates the Slant Stack coefficients and the 2D Fourier coefficients. 
To be more specific, we are able to calculate the slant stack coefficients by first 
calculating the 2D Fourier Transform of I on a pseudopolar grid (see Figure 
8) and then applying a series of 1-D inverse FFTs along radial lines. Each 
application of the 1-D inverse FFT yields a vector of coefficients that correspond 
to the slant-stack transform of I along a family of parallel lines. 
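
The defining sum can be evaluated directly, without the pseudopolar machinery, by shifting each column with a Fourier phase ramp and then summing. The sketch below is a slow reference evaluation for a single slope, not the fast algorithm of [6]. It assumes periodic trigonometric interpolation in y (so intercepts are limited to $-n/2 \le t < n/2$ and lines wrap around), whereas [6] uses padding to avoid wraparound. The names `shift_columns` and `slant_stack` and the storage convention `I[x + n/2, y + n/2]` are assumptions made for this illustration.

```python
import numpy as np

def shift_columns(I, s):
    """Return J with J[u, v] ~ I~(x_u, y_v + s * x_u), interpolating along y.

    Uses the Fourier shift theorem, i.e. periodic trigonometric interpolation.
    """
    n = I.shape[0]
    x = np.arange(-n // 2, n // 2)                  # centered x coordinates
    freq = np.fft.fftfreq(I.shape[1])               # frequencies in cycles per sample
    F = np.fft.fft(I, axis=1)
    phase = np.exp(2j * np.pi * np.outer(s * x, freq))
    return np.real(np.fft.ifft(F * phase, axis=1))

def slant_stack(I, s):
    """Sum over x of I~(x, s*x + t) for all intercepts t = -n/2, ..., n/2 - 1."""
    return shift_columns(I, s).sum(axis=0)
```

For an image supported on the line y = x, calling `slant_stack` with s = 1 concentrates all the mass at the intercept t = 0, which is stored at index n/2.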

Figure 9 shows backprojections of different delta sequences, each concentrated 
at a single point in the coefficient space and corresponding to a choice of slope- 
intercept pair. The panels show the 2-D arrays of weights involved in the coef- 
ficient computation. Summing with these weights is approximately the same as 
exactly summing along lines of given slope/intercept. 

Figure 8. The pseudopolar grid (n = 8), constructed from concentric squares: 
Cartesian data are converted into data at the intersections of concentric squares 
and lines radiating from the origin with equispaced slopes. 

Figure 9. 2D Slant Stack Lines. 

As Averbuch et al. point out, the fast slant stack belongs to a group of 
algorithms developed over the years in synthetic aperture radar by Lawton [40] and 
in medical imaging by Pasciak [44] and by Edholm and Herman [24], where it is 
called the Linogram. The Linogram has been exploited systematically for more 
than ten years in connection with many problems of medical imaging, including 
cone-beam and fan-beam tomography, which concern image reconstruction from 
subsets of the X-ray transform. In a 3-D context the most closely related work 
in medical imaging concerns the planogram; see [38; 39], and our discussion in 
Section 10.5 below. The terminology ‘slant stack’ comes from seismology, where 
this type of transform, with different algorithms, has been in use since the 1970’s 
[15]. 

5.1.2. Overall Strategy. We can use the slant stack to build a 3-D X-ray transform 
by grouping together lines into subfamilies which live in a common plane. We 
then extract that plane from the data cube and apply the slant stack to that 
plane, rapidly obtaining integrals along all lines in that plane. We ignore for the 
moment the question of how to extract planes from digital data when the planes 
are not oriented along the coordinate axes. 

In detail, our strategy works as follows. Suppose we want to get transform 
coefficients corresponding to x-driven 3-D lines, i.e. lines obeying 

\[ y = s_y x + t_y, \qquad z = s_z x + t_z. \]

Within the family of all $n^4$ lines of this type, consider the subfamily $L_{xz,n}(s_z, t_z)$ 
of all lines with a fixed value of $(s_z, t_z)$ and a variable value of $(s_y, t_y)$. Such lines 
all lie in the plane $P_{xz}(s_z, t_z)$ of (x, y, z) with (x, y) arbitrary, $z = s_z x + t_z$. We 
can consider this set of lines as taking all x-driven 2-D lines in the (x, y) plane 
and then ‘tilting’ the plane to obey the equation $z = s_z x + t_z$. Our intention 
is to extract this plane, sampling it as a function of x and y, and use the slant 
stack to evaluate all the line integrals for all the x-driven lines in that plane, 
thereby obtaining all the integrals in $L_{xz,n}(s_z, t_z)$ at once, and to repeat this for 
other families, working systematically through values of $s_z$ and $t_z$. 

Some of these subfamilies with constant intercept t and varying slope s are 
depicted in Figure 10. 








Figure 10. Planes generated by families of lines in the Slope-Angle dictionary; 
subpanels indicate various choices of slope. 

In the end, then, our coordinate system for lines has one slope and one 
intercept to specify a plane and one slope and one intercept to specify a line within 
the plane. 







Figure 11. Lines selected from planes via slope-intercept indexing. 



5.1.3. 3-D Shearing. To carry out this strategy, we need to extract data lying in 
a general 2-D plane within a digital 3-D array. 

We make a simple observation: to extract from the function f(x, y, z) defined 
on the full cube its restriction to the plane with $z = s_z x + t_z$, and x, y varying, 
we simply create a new function f'(x, y, z) defined by 

\[ f'(x, y, z) = f(x, y, z - s_z x - t_z) \]

for x, y, z varying throughout $[0,1]^3$, with f taken as vanishing at arguments 
outside the unit cube. We then take g(x, y) = f'(x, y, 0) as our extracted plane. 
The idea is illustrated in Figure 12. 

In order to apply this idea to the case of digital arrays I(x, y, z) defined on a 
discrete grid, note that, in general, $z - s_z x - t_z$ will not be an integer even when 
z and x are, and so the expression $I(x, y, z - s_z x - t_z)$ is not defined; one needs 
to make sense of this quantity somehow. At this point we invoke the notion of 
shearing of digital images as discussed, for example, in [54; 6]. Given a 2-D n x n 
image I(x, y) where $-n/2 \le x, y < n/2$, we define the shearing of y as a function 
of x at slope s, $Sh_{xy}^{(s)}$, according to 

\[ (Sh_{xy}^{(s)} I)(x, y) = I_2(x, y - sx). \]

Figure 12. Shearing and slicing a 3D image. Extracting horizontal slices of a 
sheared 3-D image is the same as extracting slanted slices of the original image. 

In words, the image is shifted vertically in each column x = constant, with the 
shift varying from one column to the next in an x-dependent way. Here $I_2(x, y)$ is 
an image which has been interpolated in the vertical direction so that the second 
argument can be a general real number and not just an integer. Specifically, 

\[ I_2(x, u) = \sum_v \phi_n(u - v)\, I(x, v), \]

where $\phi_n$ is an interpolation kernel, a continuous function of a real variable 
obeying $\phi_n(0) = 1$, $\phi_n(k) = 0$ for $k \ne 0$. The shearing of x as a function of y 
works similarly, with 

\[ (Sh_{yx}^{(s)} I)(x, y) = I_1(x - sy, y), \]

with 

\[ I_1(u, y) = \sum_v \phi_n(u - v)\, I(v, y). \]

We define a shearing operator for a 3-D data cube by applying a 2-D operator 
systematically to each 2-D plane in a family of parallel planes normal to one of 
the coordinate axes. Thus, if we speak of shearing in z as a function of x, we 
mean 

\[ Sh_{xz}^{(s)} I(x, y, z) = I_3(x, y, z - sx). \]

What shearing does is map a family of tilted parallel planes into a plane normal 
to one of the coordinate axes. In the above example, data along the plane 
z = sx + t is mapped onto the plane z = t. Figure 12 illustrates the process 
graphically, exaggerating the process, by allowing pieces of the original image to 
be sheared out of the original data volume. In fact those pieces ‘moving out’ of 
the data volume get ‘chopped away’ in actual computations. 
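
A shear of this kind can be sketched with one FFT per fiber along z. The sketch below implements the displayed formula $I_3(x, y, z - sx)$ literally, with two simplifying assumptions that are not from the text: the interpolation kernel is the periodic (Dirichlet) kernel implied by the FFT, and data sheared past the boundary wrap around instead of being chopped away. The name `shear_z_of_x` and the storage convention `I[x + n/2, y, z]` are hypothetical.

```python
import numpy as np

def shear_z_of_x(I, s):
    """Return J with J[x, y, z] ~ I_3(x, y, z - s * x), interpolating along z.

    Each fiber along the last axis is shifted by s * x via the Fourier shift theorem.
    """
    n = I.shape[0]
    x = np.arange(-n // 2, n // 2)                 # centered x coordinates
    freq = np.fft.fftfreq(I.shape[2])              # frequencies along z
    F = np.fft.fft(I, axis=2)
    phase = np.exp(-2j * np.pi * s * x[:, None, None] * freq[None, None, :])
    return np.real(np.fft.ifft(F * phase, axis=2))
```

With this sign convention the slice `J[:, :, k]` collects the values $I(x, y, z_k - sx)$; for integer shifts the result agrees with a circular roll of each constant-x slab.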

5.1.4. The Algorithm. Armed with this tool, we define the slant stack based X-ray 
transform algorithm as follows, giving details only for a part of the computation. 
The algorithm works separately with x-driven, y-driven, and z-driven lines. The 
procedure for x-driven lines is as follows: 

• for each slope $s_z$ 

— Shear z as a function of x with slope $s_z$, producing the 3-D voxel array 
$I_{xz,s_z}$. 

— for each intercept $t_z$ 

* Extract the 2-D image $I_{s_z,t_z}(x, y) = I_{xz,s_z}(x, y, t_z)$. 

* Calculate the 2-D X-ray transform of this image, obtaining an array of 
coefficients $X(s_y, t_y)$, and storing these in the array $X_3(\text{'x'}, s_y, t_y, s_z, t_z)$. 

— end for 

• end for 

The procedure is analogous for y- and z-driven lines. 

The lines generated by this algorithm are as illustrated in Figure 11. 

The time complexity of this algorithm is $O(n^4 \log(n))$. Indeed, the cost of the 
2-D slant-stack algorithm is order $n^2 \log(n)$ (see [6]), and this must be applied 
order $n^2$ times, once for each subfamily $L_{xz,n}(s_z, t_z)$. 
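
The loop structure for x-driven lines can be sketched as follows. This is a hedged illustration, not the authors' code: the 2-D step is a slow direct evaluation by Fourier shifts and not the fast slant stack of [6]; interpolation is periodic, so intercepts are restricted to $-n/2 \le t < n/2$; and the sign of the shear is chosen here so that slice $t_z$ of the sheared cube samples the plane $z = s_z x + t_z$. The names `shift_last_axis` and `xray_x_driven` and the output layout `X[s_z, t_z, s_y, t_y]` are assumptions.

```python
import numpy as np

def shift_last_axis(A, a):
    """Return B with B[..., v] ~ A~(..., v + a[...]) (periodic Fourier interpolation)."""
    freq = np.fft.fftfreq(A.shape[-1])
    phase = np.exp(2j * np.pi * np.asarray(a, float)[..., None] * freq)
    return np.real(np.fft.ifft(np.fft.fft(A, axis=-1) * phase, axis=-1))

def xray_x_driven(I):
    """Coefficients X[i_sz, tz, i_sy, ty] for x-driven lines of a cube I[x, y, z]."""
    n = I.shape[0]
    x = np.arange(-n // 2, n // 2)            # centered x coordinates
    slopes = np.arange(-n, n) / n             # s = k/n, k = -n, ..., n - 1
    X = np.empty((2 * n, n, 2 * n, n))
    for i, sz in enumerate(slopes):           # for each slope s_z
        # Shear z as a function of x: sheared[x, y, z] ~ I~(x, y, z + sz * x)
        sheared = shift_last_axis(I, (sz * x)[:, None])
        for tz in range(n):                   # for each intercept t_z
            plane = sheared[:, :, tz]         # extracted 2-D image in (x, y)
            for j, sy in enumerate(slopes):   # 2-D transform of the plane
                X[i, tz, j, :] = shift_last_axis(plane, sy * x).sum(axis=0)
    return X
```

The sketch costs far more than $O(n^4 \log(n))$ because the inner 2-D transform is evaluated slope by slope; it is meant only to show how the shear, the slice extraction and the 2-D transform nest.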

5.2. Compatibility with cache memory. A particularly nice property of 
this algorithm is that it is cache-aware, i.e. it is very well-organized for use with 
modern hierarchical memory computers [32]. In currently dominant computer 
architectures, main memory is accessed at a speed which can be an order of 
magnitude slower than the cache memory on the CPU chip. As a result, other 
things being equal, an algorithm runs much faster if it operates as follows: 

• Load n items from main memory into the cache 

• Work intensively to compute n results 

• Send the n results out to main memory 

Here the idea is that the main computations involve relatively small blocks of 
data that can be kept in cache all at once and are referred to many times while in 
the fast cache memory, saving dramatically on main memory accesses. 

The Slant-Stack/Shearing algorithm we have described above has exactly this 
form. In fact it can be decomposed in steps, every one of which can be concep- 
tualized as follows: 

• Load n items from main memory into the cache 

• Do some combination of: 

— Compute an n-point forward FFT; or 

— Compute an n-point inverse FFT; or 

— Perform elementwise transformation on the n-vector; 

• Send the n results out to main memory 

Thus the 2-D slant stack and the 3-D data shearing operations can all be 
decomposed into steps of this form. For example, data shearing requires computing 
sums of the form $I'(x, y, z) = \sum_u \phi_n(z - sx - u)\, I(x, y, u)$. For each fixed (x, y), 
we take the n numbers $(I(x, y, u) : u = -n/2, \ldots, n/2 - 1)$, take their 1-D FFT, 
multiply the FFT by a series of appropriate coefficients, and 
then take their inverse 1-D FFT. The story for the slant stack is similar, but far more 
complicated. A typical step in that algorithm involves the 2-D FFT, which is 
obtained by applying order 2n 1-D FFT’s, once along each row and once along 
each column. For more details see comments in [6]. 

It is also worth remarking that several modern CPU architectures offer FFT 
in silico, so that the FFT step in the above decomposition runs without any 
memory accesses for instruction fetches. Such architectures (which include the 
G4 processor running on Apple Macintosh and IBM RS/6000) are even more 
favorable towards this algorithm. 

As a result of this cache- and CPU-favorable organization the observed behav- 
ior of this algorithm is far more favorable than what asymptotic theory would 
suggest. The vertex-pairs algorithms of the previous section sit at the opposite 
extreme; since those algorithms involve summing data values along lines, and 
the indices of those values are scattered throughout the linear storage allocated 
to the data cube, those algorithms appear to be performing essentially random 
access to memory; hence such algorithms run at the memory access speed rather 
than the cache speed. In some circumstances those algorithms can even run 
more slowly still, since cache misses can cost considerably more than one mem- 
ory access, and random accesses can cause large numbers of cache misses. These 
remarks are in line with behavior we will observe empirically below. 

5.3. Frequency domain algorithm. Mathematical analysis shows that the 
3-D X-ray transform of a continuum function f(x, y, z) can be obtained from the 
Fourier transform [51; 47]. This frequency-domain approach requires 
coordinatizing planes through the origin in frequency space by 

\[ \mathcal{P}_{u_1,u_2} = \{ \xi = u_1 \xi_1 + u_2 \xi_2 \}, \]

extracting sections of the Fourier transform along such planes, 

\[ \hat g(\xi_1, \xi_2) = \hat f(u_1 \xi_1 + u_2 \xi_2), \]

and then taking the inverse Fourier transform of those sections: 

\[ g = \mathcal{F}^{-1} \hat g. \]

The resulting function g gives the X-ray transform for lines 

\[ g(x_1, x_2) = \int f(x_1 v_1 + x_2 v_2 + t v_3)\, dt, \]

with an appropriate orthobasis $(v_1, v_2, v_3)$. 
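
The simplest discrete instance of this projection-slice relation uses the coordinate plane $\xi_3 = 0$: the 2-D inverse FFT of that section of the 3-D FFT equals the sum of the data along z. The NumPy sketch below checks only this axis-aligned case; the function name is hypothetical, and tilted planes would require the interpolation discussed next.

```python
import numpy as np

def xray_along_z_via_fourier(f):
    """Integrate f[x, y, z] along z by slicing its 3-D FFT at zero z-frequency."""
    F = np.fft.fftn(f)                 # 3-D Fourier transform of the data cube
    section = F[:, :, 0]               # plane through the origin: xi_3 = 0
    return np.real(np.fft.ifft2(section))
```

For real data the result agrees with `f.sum(axis=2)` up to rounding error.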

To carry this out with digital data would require developing a method to 
efficiently extract many planes through the origin of the Fourier transform cube, 
and then perform 2-D inverse FFT’s of the data in those planes. But how to 
rapidly extract a rich selection of planes through the origin? (The problem 
initially sounds similar to the problem encountered in the previous section, but 
recall that the set of planes needed there were families of parallel planes, not 
families of planes through the origin.) 

Our approach is as follows. Pick a fixed preferred coordinate axis, x, say. 
Pick a subordinate axis, z, say. In each constant-y slice, do a two-dimensional 
shearing of the FT data, shearing z as a function of x at fixed slope s z . In 
effect, we have tilted the data cube, so that slices normal to the z-axis in the 
sheared volume correspond to tilted planar slices in the original volume. So now 
take each y-z plane, and apply the idea of Cartesian-to-pseudopolar conversion as 
described in [6]. This uses interpolation to convert a planar Cartesian grid into 
a new point set consisting of n lines through the origin at various angles, and 
equispaced samples along each line. This conversion being done for each plane 
with x fixed, then, grouping the data in a given line through the origin across all 
x values produces a plane; see Figure 13. We then take a 2-D inverse transform 
of the data in this plane. 

The computational complexity of the method goes as follows. $O(n^3 \log(n))$ 
operations are required for transforming from the original space domain to the 
frequency domain; $O(n^2 \log(n))$ work for each conversion of a Cartesian plane to 
pseudopolar coordinates, giving $O(n^3 \log(n))$ work to convert a whole stack of 
parallel planes in this way; $O(n^3 \log(n))$ work to shear the array as a function of 
the preferred coordinate; and 3n such shearings need to be performed. Overall, 
we get $O(n^4)$ coefficients in $O(n^4 \log(n))$ flops. 

Figure 13. Selecting planes through the origin. Performing 
Cartesian-to-pseudopolar conversion in the yz plane and then gathering all the data 
for one radial line across different values of x produces a series of planes through 
the origin. 

We have not pursued this method in detail, for one reason: it is mathematically 
equivalent to the slant-stack-and-shearing algorithm, providing exactly the same 
results (assuming exact arithmetic). This is a consequence of the projection-slice 
theorem for the slant stack transform proved in [6]. 

6. Performance Measures 

We now consider two key measures of performance of the fast algorithms just 
defined: accuracy and timing. 

6.1. Accuracy of two-scale recursion. To estimate the accuracy of the 
two-scale recursion algorithm, we considered a $16^3$ array and compared coefficients 
from two-scale approximation with direct evaluation. We computed the average 
error for the different scales and applied the algorithms both to a 3-D image that 
contains a single beamlet and to a 3-D image that contains randomly distributed 
ones in a sea of zeros, chosen so that both 3-D images have the same $\ell^2$ norm. The 
table below shows that the coefficients obtained from the two-scale recursion are 
significantly different from those of direct evaluation. 



          Analyze Single Beamlet      Analyze Random Scatter 
scale     relative error              relative error 
0         0.117                       0.056 
1         0.107                       0.061 
2         0.076                       0.048 
3         1.5 x 10^-17                3.7 x 10^-17 



One way to understand this phenomenon is to look at what the coefficients are 
measuring by studying the equivalent kernels for those coefficients. Let $T^1$ be the 
linear transform on I corresponding to the exact evaluation of the line integrals 
and let $T^2$ be the linear transform corresponding to the two-scale recursion 
algorithm. Apply the adjoint of each transform to a coefficient-space vector 
with a one in one position and a zero in other positions, getting 

\[ w_b^j = (T^j)^* \delta_b, \qquad j = 1, 2. \qquad (6\text{-}1) \]

Each $w_b^j$ lives in image-space, i.e., it is indexed by voxels v, and the entries 
$w_b^j(v)$ indicate the weights such that $(T^j I)(b) = \sum_v I(v)\, w_b^j(v)$. In essence this ‘is’ 
the beamlet we are using in that beamlet transform. For later use: we call the 
operation of calculating $w_b$ that of ‘backprojection’, because we are going back 
from coefficient space to image space. This usage is consistent with usage of the 
term in the tomographic literature, e.g. [47; 15]. 
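
For a transform stored as an explicit matrix, the construction in (6-1) is one line of linear algebra. The sketch below assumes a hypothetical dense matrix `T` whose rows are indexed by coefficients b and whose columns are indexed by (flattened) voxels; in practice the transforms of this paper are applied as algorithms, not stored as matrices.

```python
import numpy as np

def representer(T, b):
    """Backproject a unit spike: w_b = T^* delta_b, an image-space weight vector."""
    delta = np.zeros(T.shape[0])
    delta[b] = 1.0                     # one in position b, zeros elsewhere
    return T.conj().T @ delta
```

By construction, the b-th coefficient of any image is the inner product of that image with `representer(T, b)`.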

Figure 14. Timing comparison. 

6.2. Timing comparison. The defining feature of 3-D processing is the 
massive volume of data involved and the attendant long execution times for 
even basic tasks. So the burning issue is: how do the algorithms perform in 
terms of CPU time to complete the task? The display in Figure 14 shows 
that both the direct evaluation and the two-scale recursion methods slow down 
dramatically as n increases; one expects a $1/n^{5/3}$ or $1/n^{4/3}$ scaling law to be 
evident in this display, and in rough terms, the display is entirely consistent with 
that law. The surprising thing in this display is the improvement in performance 
of the slant stack with increasing n. This seeming anomaly is best interpreted 
in terms of the cache-awareness of the slant stack algorithm. The slant stack 
algorithm becomes more and more immune to cache misses as n increases (at 
least in the range we are studying), and so the number of cache misses per 
coefficient drops lower and lower for this algorithm, while this effect is totally 
absent for the direct evaluation and two-scale recursion algorithms. 



7. Examples of X-Ray Transforms 

We now give a few examples of the X-ray transform based on the slant stack 
method. 

7.1. Synthesis. While we have not discussed it at length, the adjoint of the 
X-ray transform is a very useful operator; for each variant of the X-ray transform 
that we have discussed, the corresponding adjoint can be computed using ideas 
very similar to those which allowed us to compute the transform itself, and with 
comparable computational complexity. Just as the X-ray transform takes voxel 
arrays into X-ray coefficient arrays, the adjoint transform takes X-ray coefficient 
arrays into voxel arrays. 

We have already mentioned, near (6-1) above, that when the adjoint operator 
is applied to a coefficient array filled with zeros except for a one in a single slot, 
the result is a voxel array. This array contains the weights $w_b(v)$ underlying 
the corresponding X-ray transform coefficient. In formal mathematical language 
this is the Riesz representer of the b-th coefficient. Intuitively, the representer 
should have its nonzero weights all concentrated on or near the corresponding 
‘geometrically correct’ line. 

To check this, we depict in Figure 15 representers of four different X-ray 
coefficients. Evidently, these are geometrically correct. 











Figure 15. Representers of several X-ray coefficients. 



It is also worth considering what happens if we apply the adjoint to coefficient 
vectors which are ones in various regions and zeros elsewhere in coefficient space. 
Intuitively, the result should be a bundle of lines. Depending on the span of the 
region in slope and intercept, the result might be simply like a thick rod (if only 
intercepts are varying) or like a dumbbell (if only slopes are varying). To check 
this, we depict in Figure 16 backprojection of six different region indicators. 
With a little reflection, we can see that these are geometrically correct. 

It is of interest to consider backprojection of more interesting coefficient ar- 
rays, such as wavelets with vanishing moments. We have done so and will discuss 
the results elsewhere. 

Figure 16. X-ray back-projections of various rectangles in coefficient space. 
Note that if the rectangle involves intercepts only, the backprojection is 
rectangular (until cut off by cube boundary). If the rectangle involves slopes, the 
backprojection is dumbbell-shaped (see lower right). 

7.2. Analysis. Now that we have the ability to generate linelike objects in 3-D 
via backprojection from the X-ray domain, we can conveniently investigate the 
properties of X-ray analysis. 

Consider the example given in Figure 17. A beam is generated by 
backprojection as in the previous section. It is then analyzed according to the X-ray 
transform. If the X-ray transform were orthogonal, then we would see perfect 
concentration of the transform in coefficient space, at precisely the location of the 
spike used to generate the beam. However, the transform is not orthogonal, and 
what we see is a concentration, but not perfect concentration, in coefficient 
space near the location of the true generator. 

Also, if the transform were orthogonal, the rearranged sorted coefficients 
would have a single nonzero coefficient. As the figure shows, the coefficients 
decay linearly on a semilog plot, indicating power-law decay. The lower right 
subpanel shows the decay of the wavelet-X-ray coefficients that are computed by 
applying a four dimensional periodic orthogonal wavelet transform to the X-ray 
coefficients. As expected, the decay is much faster than the decay of the X-ray 
coefficients. 

8. Application: Detecting Fragments of a Helix 

We now sketch briefly an application of beamlets to detecting fragments of
a helix buried in noise. We suppose that we observe a cube of noisy 3-D data,
and that, possibly, the data contains (buried in noise) a filamentary object. By
‘filamentary object’ we mean the kind of situation depicted in Figure 18. A series
of pixels overlapping a nonstraight curve is highlighted there, and we imagine
that, when such an object is ‘present’ in our data, a constant multiple of
that 3-D template is added to a pure noise data cube.

Figure 17. X-ray analysis of a beam. (a) The X-ray transform sliced in the
constant-intercept plane. (b) The X-ray transform sliced in the constant-slope
plane. (c) The sizes of sorted X-ray coefficients. (d) The sizes of sorted
wavelet-X-ray coefficients.




Figure 18. A noiseless helix. 

When this is done, we have a situation that is hard to depict graphically, since 
one cannot ‘see through’ such a noisy cube. By this we mean the following: to 
visualize such a data cube, it seems that we have just two rendering options.
We can view the cube as opaque, render only the surface, and then we certainly 
will not see what’s going on inside the cube. Or we can view the cube as trans- 
parent, in which case, when each voxel is assigned a gray value based on the 
corresponding data value, we see a very uniformly gray object. 

Being stymied by the task of 3-D visualization of the noisy cube, we instead 
display some 2-D slices of the cube; see the rightmost panel of Figure 19. For 
comparison, we also display the same slices of the noiseless helix. The key point 
to take away from this figure is that the noise level is so bad that the presence 
of the helical object would likely not be visible in any slice through the data 
volume. 



Figure 19. Three orthogonal slices through (a) a noiseless helix; (b) the noisy 
data volume. 

Here is a simple idea for detecting a noisy helix: beamlet thresholding. We 
simply take the beamlet transform, normalize each empirical beamlet coefficient 
by dividing by the length of the beamlet, and then identify beamlet coefficients 
(if any) that are unusually large compared to what one would expect if we were 
in a noise-only situation. 
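
The thresholding rule just described can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used to produce the figures: we assume that each raw beamlet coefficient is roughly a sum of voxel values along the beamlet, that the noise is i.i.d. Gaussian with known standard deviation `sigma`, and that a beamlet of length `L` touches about `L` voxels. The function name `significant_beamlets` and the cutoff `z_thresh` are hypothetical.

```python
import numpy as np

def significant_beamlets(coeffs, lengths, sigma, z_thresh=4.0):
    """Return indices of beamlets whose normalized coefficient is unusually large.

    coeffs  : raw beamlet coefficients (assumed: sums of voxel values along each beamlet)
    lengths : beamlet lengths, in voxels
    sigma   : noise standard deviation per voxel (assumed known)
    """
    coeffs = np.asarray(coeffs, dtype=float)
    lengths = np.asarray(lengths, dtype=float)
    # Normalize each coefficient by the length of its beamlet, as in the text.
    normalized = coeffs / lengths
    # Assumption: under pure noise, a sum over ~length voxels has standard
    # deviation sigma*sqrt(length), so the normalized value has sigma/sqrt(length).
    noise_std = sigma / np.sqrt(lengths)
    return np.flatnonzero(np.abs(normalized) > z_thresh * noise_std)

# Toy usage: only the second beamlet stands out against the noise level.
hits = significant_beamlets([0.5, 40.0, -1.0], [4, 16, 9], sigma=1.0)
```

The cutoff of four noise standard deviations is arbitrary; in practice it would be chosen with the very large number of beamlets in mind.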

Figure 20 shows the results of applying such a procedure to the noisy data 
example of Figures 18-19. The extreme right subpanel shows the beamlets that 
were found to have significant coefficients. The center panel shows the result of 
backprojecting those significant beamlets; a rough approximation to the filament 
(far left) has been recovered. 

9. Application: A Frame of Linelike Elements 

We also briefly sketch an application of the X-ray transform to data
representation. As we have seen in Section 7.1, the backprojection of a delta
sequence in X-ray coefficient space is a line-like element. We have so far
interpreted this as meaning that the X-ray transform defines an analysis of data
via line-like elements. But it may also be interpreted as saying that
backprojection from coefficient space defines a synthesis operator, which, for the
‘right’ coefficient array, can synthesize a volumetric image from linelike elements.

Figure 20. A noiseless helix, a reconstruction from noisy data obtained by
backprojecting coefficients exceeding threshold, and a depiction of the beamlets
associated to significant coefficients.

The trick is to find the ‘right’ coefficient array to synthesize a given desired
object. This can be conceptually challenging because the X-ray transform is
overdetermined, giving order $n^4$ coefficients for an order $n^3$ data cube. Iterative
methods for solving large-scale linear systems can be tried, but will probably be
ineffective, owing to the large spread in singular values of the X-ray operator.

There is a way to modify the (slant-stack/shearing) X-ray transform to produce
something that has a reasonably controlled spread of the singular values. This
uses the fact, as described in Averbuch et al. [6], that there is an effective
preconditioner for the 2-D slant stack operator $S$ (say), such that the
preconditioned operator $\tilde S$ obeys

$$c_0 \|I\|_2 \le \|\tilde S I\|_2 \le c_1 \|I\|_2 .$$

Here $c_1/c_0 < 1.1$. Hence, the transform from 2-D images to their coefficients is
almost norm-preserving. In effect, $\tilde S$ performs a kind of fractional differentiation
of the image before applying $S$. If, in following the construction of the X-ray
transform that was laid out in Section 5.1, we simply replace each invocation
of $S$ by $\tilde S$, then effectively the transform coefficients, grouped together in the
families $\mathcal{L}_{xz,n}(s_z, t_z)$, have in each such group roughly the same norm as the data
in the corresponding plane $\mathcal{P}_{xz,n}(s_z, t_z)$, say, of the data cube. For each fixed
slope $s_z$, the family of planes $\mathcal{P}_{xz,n}(s_z, t_z)$ with different intercepts $t_z$ fills out the
whole data cube, and so the norms of all these planes, combined together by a
sum of squares, give the squared norm of the whole data cube. It follows that
the transform of a volumetric image $I(x,y,z)$ should yield a coefficient array
with $\ell_2$ norm roughly proportional to the $\ell_2$ norm of the array $I$.

Definition 3. The preconditioned X-ray transform $\tilde X$ is the result of following
the prescription of Section 5.1 to build an X-ray transform, only using the
preconditioned slant stack rather than the slant stack.

We should note that in the theory of the continuum X-ray transform [51], there
is the notion of X-ray isometry, which preserves the $L^2$ norm while mapping
from physical space to line space. This can be viewed as applying the X-ray
transform to a fractional differentiation of the object $f$, rendering the whole
system an isometry. The preconditioned digital X-ray operator $\tilde X$ we have just
described is a digital analog, although it does not provide a precise isometry.

Standard facts in linear algebra (e.g. [28; 30]) imply that, because the output
norm $\|\tilde X I\|_2$ is (roughly) proportional to the input norm $\|I\|_2$, iterative
algorithms (relaxation, conjugate gradients, etc.) should be able to efficiently solve
equations $\tilde X I = y$.
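
To illustrate why a narrow spread of singular values matters, here is a minimal sketch of conjugate gradients applied to the normal equations. It is not the X-ray operator: the matrix `A` is a synthetic tall matrix whose singular values are placed in $[1, 1.1]$ to mimic the bound $c_1/c_0 < 1.1$, and the names `cgnr`, `A`, `x_true` are assumptions made for the example.

```python
import numpy as np

def cgnr(A, y, iters=10):
    """Conjugate gradients on the normal equations A^T A x = A^T y."""
    x = np.zeros(A.shape[1])
    r = A.T @ y              # residual of the normal equations at x = 0
    p = r.copy()
    rs = r @ r
    for _ in range(iters):
        Ap = A.T @ (A @ p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if rs_new < 1e-28:   # residual is at rounding level; stop
            break
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

# Synthetic overdetermined operator with singular values in [1, 1.1].
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((40, 20)))   # 40 x 20, orthonormal columns
V, _ = np.linalg.qr(rng.standard_normal((20, 20)))   # 20 x 20, orthogonal
A = U @ np.diag(np.linspace(1.0, 1.1, 20)) @ V.T
x_true = rng.standard_normal(20)
x_hat = cgnr(A, A @ x_true, iters=10)
```

With singular values this tightly clustered, a handful of iterations recovers `x_true` to high accuracy; a large spread of singular values would slow convergence considerably.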

The X-ray transform is highly redundant (as it maps $n^3$ arrays into $O(n^4)$
arrays). As a way to obtain greater sparsity, one might consider applying an
orthogonal wavelet transform to the X-ray coefficients. This will preserve the
norm of the coefficients, while it may compress the energy into a few large
coefficients. The transform is (naturally) 4-dimensional, but as the display in Figure
17 suggests, our concern is more to compress in the slope variable where the
analysis of a beam is spread out, rather than in the intercept variables, where
the analysis of a beam is already compressed.

Definition 4. The wavelet-compressed X-ray transform $W\tilde X$ is the result of
applying an orthogonal 4-D wavelet transform to the preconditioned X-ray
transform.

Label the coefficient indices in the wavelet-compressed X-ray transform domain
as $\lambda \in \Lambda$, and let the entries in $W\tilde X I$ be labeled $\alpha = (\alpha_\lambda)$; they are the
wavelet-compressed preconditioned X-ray coefficients.

It turns out that one can reconstruct the original image $I$ from its coefficients
$\alpha$. As the wavelet transform is norm-preserving, the map $I \mapsto W\tilde X I$ is
proportional to an almost norm-preserving transform, and hence one can go back
from coefficient space to image space, using iterative linear algebra. Call this
generalized inverse (linear) transformation $(W\tilde X)^{\dagger}$. Then certainly $I = (W\tilde X)^{\dagger} \alpha$.

This can be put in a more interesting form. The result of applying this
generalized inverse transform to a delta coefficient sequence $\delta_{\lambda_0}(\lambda)$ spiking at
coefficient index $\lambda_0$ (say) provides a volumetric object $\phi_{\lambda_0}(v)$. Hence we may
write

$$I = \sum_{\lambda} \alpha_\lambda \phi_\lambda .$$

The object $\phi_\lambda$ is a frame element, and we have thus defined a frame of linelike
elements in 3-space. Emmanuel Candès, in personal correspondence, has called
such things tubelets, although we are reluctant to settle on that name for now
(tubes being flexible rather than straight and rigid).

In [16] a similar construction has been applied in the continuum case: a
wavelet tight frame has been applied to the X-ray isometry to form a linelike
frame in the continuum $\mathbb{R}^3$.

Figure 21. A frame element.

This construction is also reminiscent of the construction of ridgelets for rep- 
resentation of continuous functions in 2-D [14]. Indeed, orthonormal ridgelets 
can be viewed as the application of orthogonal wavelet transform to the Radon 
isometry [18]. In [19] a construction paralleling the one suggested here has been 
carried out for 2-D digital data. 

10. Discussion 

We finish up with a few loose ends. 

10.1. Availability. The figures in this paper can be reproduced by code
which is part of the beamlab package. Point your web browser to
http://www-stat.stanford.edu/~beamlab to obtain the software. The software can
reproduce all the figures in this paper and was produced in keeping with
the philosophy of reproducible research.

10.2. In practice. There are of course many variations on the above schemes, 
but we have restrained ourselves from discussing them here, even when they 
are variations we find practically useful, in order to keep things simple. A few 
examples: 

• We find it very useful to work with an alternative vertex-pair dictionary, where 
the vertices of beamlets are not at corners of boundary voxels for a dyadic 
cube, but instead at midpoints of boundary faces of boundary voxels. 

• We find it useful to work with slight variations of the slant stack defined in 
[6], where the angular spacing of lines is chosen differently than in that paper. 

Rather than burden the reader with such details, we suggest merely that the 
interested reader study the released software. 

10.3. Beamlet algorithms. As mentioned in the introduction, in this paper
we have not been able to describe the use of the graph structure of the beamlets 
in which two beamlets are connected in the graph if and only if they have an 
endpoint in common. In all the examples above, each beamlet is treated in- 
dependently of other beamlets. As we showed earlier, every smooth curve can 
be efficiently approximated by relatively few beamlets in a connected chain. In 
order to take advantage of this fact we must use some mechanism for examining 
different beamlet chains. The graph structure affords us such a mechanism. 

This structure can be useful because there are some low complexity, network- 
flow based procedures [43; 27] that allow one to optimize over all paths through 
a graph. Such paths in the beamlet graph correspond to connected chains of 
beamlets. When applied in the multiscale graph provided by 2-D beamlets, 
these algorithms were found in [21] to have interesting applications in detecting 
filaments and segmenting data in 2-D. One expects that the same ideas will prove 
useful in 3-D. 
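
As a toy illustration of optimizing over chains in a beamlet graph, the sketch below runs Dijkstra's shortest-path algorithm on a small graph whose nodes are vertices and whose edges are beamlets. Dijkstra is used here only as a simple stand-in; it is not the network-flow procedure of [43; 27]. The edge costs, the vertex coordinates, and the name `best_chain` are hypothetical.

```python
import heapq

def best_chain(beamlets, start, goal):
    """Find the minimum-cost connected chain of beamlets from start to goal.

    beamlets : dict mapping an endpoint pair (u, v) to a nonnegative cost.
    Two beamlets are linked in the graph when they share an endpoint.
    """
    adj = {}
    for (u, v), cost in beamlets.items():
        adj.setdefault(u, []).append((v, cost))
        adj.setdefault(v, []).append((u, cost))
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v, cost in adj.get(u, []):
            nd = d + cost
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if goal not in dist:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1], dist[goal]

# Three hypothetical beamlets; the two-beamlet chain is cheaper than the direct one.
toy = {
    ((0, 0, 0), (1, 1, 0)): 1.0,
    ((1, 1, 0), (2, 2, 1)): 1.0,
    ((0, 0, 0), (2, 2, 1)): 5.0,
}
chain, cost = best_chain(toy, (0, 0, 0), (2, 2, 1))
```

In a detection setting the cost of a beamlet might be derived from its coefficient, so that a cheap chain corresponds to a plausible filament; that choice is left open here.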

10.4. Connections with particle physics. In a series of interesting papers 
spanning both 2-D and 3-D applications, David Horn and collaborators Halina 
Abramovicz and Gideon Dror have found several ways to deploy line-based sys- 
tems in data analysis and detector construction [4; 5; 22]. Most relevant to our 
work here is the paper [22] which describes a linelike system of feature detec- 
tors for analysis of data from 3-D particle physics detectors. Professor Horn 
has pointed out to us, and we agree, that such methods are very powerful in 
the right settings, and that the main thing holding back widespread deployment 
of such methods is the immense size of the number of lines needed to give a 
comprehensive analysis of 3-D data. 

10.5. Connections with tomography and medical imaging. The field of 
medical imaging is rapidly developing these days, and particularly in the last few 
years, 3-D tomography has become a ‘hot topic’, with several major conferences 
and workshops. What is the connection of this work to ongoing work in medical 
imaging? 

Obviously, the X-ray transform, as we have defined it, is closely connected 
to problems of medical imaging, which certainly obtain line integrals in 3-space 
and aim to use these to reconstruct the object of interest. 

However, the layout of our X-ray transform is (seemingly) rather different
from that of current medical scanners. Such scanners are designed according to
physical and economic considerations, which place various constraints on the line integrals
which can be observed by the system. In contrast, we have only computational 
constraints and we seek to represent a very wide range of line integrals in our 
approach. For example, in an X-ray system, a source is located at a fixed point, 
and can send out beams in a cone, and the line integrals can be measured by 
a receiving device (film or other) on a planar surface. One obtains many line 
integrals, but they all have one endpoint in common. In a PET system, events 
in the specimen are detected by pairs of detectors collinear with the
event. One obtains, by summing detector-pair counts over time, an estimated 
line integral. The collection of integrals is limited by the geometry of the detector 
arrays. 

Essentially, in the vertex-pairs transform, we contemplate a situation that 
would be analogous, in PET tomography, to having a cubical room, with arrays
of detectors lining the walls, floor, and ceiling, and with all pairs of detectors 
corresponding to lines which can be observed by the system. In (physical) X-ray 
tomography, our notion of X-ray transform would correspond to a system where 
there is a ‘source wall’ and the rest of the surfaces were ‘receivers’, with the 
specimen or patients being studied oriented successively standing, prone, facing 
and in profile to the ‘source wall’. The (omnidirectional) X-ray source would be 
located for a sequence of exposures at each point of an array on the source wall 
(say). 

Neither situation is quite what medical imaging experts mean when they say 
3-D tomography. For the last ten years or so, there has been a considerable body 
of work on so called cone-beam reconstruction in 3-D physical X-ray tomography; 
see [47; 35]. In an example of such a setting [47], a source is located at a fixed 
point, the specimen is mounted on a turntable in front of a screen, and an 
exposure is made by generating radiation, which travels through the specimen;
the line integral is recorded by a rectangular array at the screen. This is
repeated for each orientation of the turntable. This would be the equivalent of 
observing the X-ray transform only for those lines which originate on a specific 
circle in the z = 0 plane, and is considerably less coverage than what we envisage. 

In PET imaging there are now so-called ‘fully 3-D scanners’, such as the CTI
ECAT EXACT HR+ described in [46]. This scanner comprises 32 circular detector
rings with 288 detectors each, allowing for a total of $77 \times 10^6$ lines. While
this is starting to exhibit some of the features of our system, with very large 
numbers of beams, the detectors are only sensitive to lines occurring within a 
cone of opening less than 30 degrees. The closest 3-D imaging device to our 
setting appears to be the fully 3-D PET system described in [37; 38; 39] where 
two parallel planar detector arrays provide the ability to gather data on all pairs 
of lines joining a point in one detector plane to a point in the other plane. In 
[38] a mathematical analysis of this system has suggested the relevance of the 
linogram (known as slant stack throughout our article) to the fully 3-D problem, 
without explicitly defining the algorithm suggested here. Without doubt, ongo- 
ing developments in 3-D PET can be expected to exhibit many similarities to 
the work in this paper, although it will be couched in a different language and 
aimed at different purposes. 

Another set of applications in medical imaging, to interactive navigation of 
3-D data, is described in [10], based on supporting tools [9; 11; 55] which are 
reminiscent of the two-scale recursive algorithm for the beamlet transform. 

10.6. Visibility. We conclude with a more speculative connection. Suppose we
have 3-D voxel data which are binary, with a ‘1’ indicating occupied and a ‘0’ 
indicating unoccupied. Then a beam which hits only ‘0’ voxels is ‘clear’, whereas 
a beam which hits some ‘1’ voxels is ‘occluded’. Question: can we rapidly tell 
whether a beam is ‘clear’ or ‘occluded’, for a more or less random beam? 

The question seems to call for rapid calculation of line integrals along every
possible line segment. If we proceed in the ‘obvious’ way, the algorithmic
cost of answering such a query is order $n$, since there are line segments
containing order $n$ voxels.

Note that, if we precompute the beamlet transform, we can approximately
answer any query about the clarity of a beam in $O(\log n)$ operations.
Indeed the beam can be written as a chain of beamlets, and we merely have to
examine all those beamlet coefficients, checking that they are all zero. There are
only $O(\log n)$ coefficients to check, from Theorem 1 above.

We can also rapidly determine the maximum distance we can go along a ray 
before becoming occluded. That is, suppose we are at a given point and might 
want to travel in a fixed direction. How far can we go before hitting something? 

To answer this, consider the segment starting at our fixed point and heading
in the given direction until it reaches the boundary of the data cube (we
obviously wouldn’t want to go out of the data cube, because we don’t have
information about what lies there). Take the segment and decompose it into beamlets.
Now check that all the beamlets are ‘clear’, i.e. have beamlet coefficients zero.
If any are not clear, go to the occluded beamlet closest to the origin, and divide
it into its (at most four) children at the next level. If any of these are not clear,
go to the occluded beamlet closest to the origin, and, once again, divide it into its
(at most four) children at the next level. Continuing in this way, we soon reach the
finest level, and determine the closest occlusion along that beam. The algorithm takes
$O(\log n)$ operations, assuming the beamlet transform has been precomputed.
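
The two queries above can be mimicked in a one-dimensional analog, which may make the mechanism easier to see. The sketch below is not the 3-D beamlet transform: it precomputes occupancy counts over all dyadic intervals of a binary array (a stand-in for precomputed beamlet coefficients), answers "is this segment clear?" by inspecting a logarithmic number of those counts, and finds the nearest occlusion by descending into children. The class name `DyadicOccupancy` and its methods are hypothetical.

```python
import numpy as np

class DyadicOccupancy:
    """1-D analog: occupancy counts over dyadic intervals of a binary array."""

    def __init__(self, occupied):
        occ = np.asarray(occupied, dtype=np.int64)
        n = occ.size
        assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
        # levels[j][k] = number of occupied cells in [k * 2**j, (k + 1) * 2**j)
        self.levels = [occ]
        while self.levels[-1].size > 1:
            prev = self.levels[-1]
            self.levels.append(prev[0::2] + prev[1::2])

    def _pieces(self, a, b):
        """Split [a, b) into maximal dyadic intervals, left to right, as (level, index)."""
        out = []
        while a < b:
            j = 0
            # Grow the block while it stays aligned at a and inside [a, b).
            while a % (2 << j) == 0 and a + (2 << j) <= b:
                j += 1
            out.append((j, a >> j))
            a += 1 << j
        return out

    def is_clear(self, a, b):
        """True if no cell in [a, b) is occupied."""
        return all(self.levels[j][k] == 0 for j, k in self._pieces(a, b))

    def first_occlusion(self, a, b):
        """Index of the first occupied cell in [a, b), or None if the segment is clear."""
        for j, k in self._pieces(a, b):
            if self.levels[j][k] == 0:
                continue
            while j > 0:                  # descend into the nearest occluded child
                j, k = j - 1, 2 * k
                if self.levels[j][k] == 0:
                    k += 1
            return int(k)
        return None

cells = np.zeros(16, dtype=int)
cells[[11, 13]] = 1
grid = DyadicOccupancy(cells)
```

For instance, `grid.is_clear(2, 11)` inspects four precomputed counts rather than nine cells, and `grid.first_occlusion(2, 16)` descends through three levels to reach cell 11.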

This allows for rapid computation of what might be called safety graphs, 
where for each possible heading one might consider taking from a given point, 
one obtains the distance one can go without collision. The cost is proportional
to (number of headings) $\times \log n$, which seems to be quite reasonable.

Traditional visibility analysis [23] assumes far more about the occluding ob- 
jects (e.g. polyhedral structure); perhaps our approach would be more useful 
when occlusion is very complicated and arises in natural systems subject to
direct voxelwise observation.

Abstract. We give an overview of phylogenetic invariants: a technique for
reconstructing evolutionary family trees from DNA sequence data. This 
method is useful in practice and is based on a number of simple ideas 
from elementary group theory, probability, linear algebra, and commutative 
algebra. 



1. Introduction 

Phylogeny is the branch of biology that seeks to reconstruct evolutionary fam- 
ily trees. Such reconstruction can take place at various scales. For example, we 
could attempt to build the family tree for various present day indigenous popula- 
tions in the Americas and Asia in order to glean information about the possible 
course of migration of humans into the Americas. At the level of species, we 
could seek to determine whether modern humans are more closely related to 
chimpanzees or to gorillas. Ultimately, we would like to be able to reconstruct 
the entire “tree of life” that describes the course of evolution leading to all present 
day species. Because the status of the “leaves” on which we wish to build a tree 
differs from instance to instance, biologists use the general term taxa (singular
taxon) for the leaves in a general phylogenetic problem.

For example, for 4 taxa, we might seek to decide whether the tree

[Tree diagram with leaves Taxon 1, Taxon 2, Taxon 3, Taxon 4]

or the tree

[Tree diagram: an alternative tree on the same four taxa]

describes the course of evolution. In such trees:

• the arrow of time is down the page, 

• paths down through the tree represent lineages (lines of descent),

• any point on a lineage corresponds to a point of time in the life of some 
ancestor of a taxon, 

• vertices other than leaves represent times at which lineages diverge, 

• the root corresponds to the most recent common ancestor of all the taxa. 

Phylogenetic reconstruction has a long history. Classically, reconstruction was 
based on the observation and measurement of morphological similarities between 
taxa with the possible addition of similar evidence from the fossil record;
and these methods continue to be used. However, with the recent explosion in 
technology for sequencing large pieces of a genome rapidly and cheaply, recon- 
struction from the huge amounts of readily available DNA sequence data is now 
by far the most commonly used technique. Moreover, reconstruction from DNA 
sequence data has the added attraction that it can operate fairly automatically 
on quite well-defined digital data sets that fit into the framework of classical 
statistics, rather than proceeding from a somewhat ill-defined mix of qualitative 
and quantitative data with the need for expert oversight to adjust for difficulties 
such as morphological similarity due to convergent evolution. 

There is a substantial literature on both the mathematics behind various 
approaches to phylogenetic reconstruction and the algorithmic issues that arise 
when we try to implement these approaches with large amounts of data and 
large numbers of taxa. We won’t attempt to survey this literature or provide a 
complete bibliography. Rather, these lecture notes are devoted to some of the 
mathematics behind one particular approach: that of phylogenetic invariants. 
Not only is this technique of practical utility, but it requires a nice combination 
of elementary group theory, probability, linear algebra, and commutative algebra. 

The outline of the rest of these notes is as follows. Section 2 begins with 
a discussion of the sort of DNA sequence data that are used for phylogenetic 
reconstruction and how these data are pre-processed using sequence alignment 
techniques. We then describe a very general class of “Markov random field”
models that incorporate arbitrary mechanisms for nucleotide substitution and 
a dependence structure for the nucleotides exhibited by the taxa that mirrors 
the phylogenetic tree. Section 3 introduces 3 restricted classes of substitution 
mechanisms that are commonly used in the literature: the Jukes-Cantor model 
and the 2- and 3-parameter Kimura models. We observe in Section 4 that standard
statistical techniques such as maximum likelihood are still computationally
very demanding for inferring phylogenies even for such restricted models and we
propose the alternative approach of phylogenetic invariants. We point out in 
Sections 5 and 6 that an underlying group structure is present in the restricted 
substitution models and develop the Fourier analysis that is necessary for ex- 
ploiting this group structure to construct and recognise invariants. 

Section 7 is a warm-up that uses these algebraic tools to exhibit an invariant 
for a particular tree. The ideas in this section are then generalised in Section 
8 to characterise the class of all invariants for an arbitrary tree. Finally, we 
determine the “dimension” of the space of invariants for an arbitrary tree in 
Section 9 and show in Section 10 that different trees have different invariants, 
with the “dimension” of the class of distinguishing invariants depending in a 
simple manner on the difference between the two trees. 

2. Data and General Models 

We assume that the reader is familiar with the basic notion of the hereditary
information of organisms being carried by DNA molecules that consist of two 
linked chains built from an alphabet of four nucleotides and twisted around each 
other in a double helix, and, moreover, that such a molecule can be described by 
listing the sequence of the nucleotides encountered along one of the chains using 
the letters A for adenine, G for guanine, C for cytosine, T for thymine. A lively 
and entertaining guide to the fundamentals is [GW91]. 

The totality of the DNA in any somatic cell constitutes the genome of the 
individual. The genomes of different individuals differ. As evolution occurs, one 
nucleotide is substituted for another, segments of DNA are deleted, and new 
segments are inserted. 

Sequence alignment is a procedure that takes DNA sequences from several
taxa, lines up “common positions” at which substitutions may or may not have
occurred, and determines where deletions and insertions have occurred in
certain sequences relative to the others. For example,
an alignment of two taxa might produce an output such as the following: 

    Taxon 1   ...AGTAACT...
    Taxon 2   ...AT***CA...

Reading from left to right: both taxa have an A in the “same” position, the next 
position is common to both taxa but Taxon 1 has a G there whereas Taxon 2 
has a T, then (due to insertions or deletions) there is a stretch of 3 positions that 
are present in the genome of Taxon 1 but not present in the genome of Taxon 2,
etc. There are many approaches to deriving such alignments, and a discussion
of them is outside the scope of these notes. A good introduction to some of the 
mathematical issues is [Wat95]. 
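
The left-to-right reading of the alignment above can be mimicked by a short routine that labels each column. This is only a reading aid, not an alignment algorithm; the function name `classify_columns` and the use of `*` as the gap symbol (taken from the display above) are assumptions of the example.

```python
def classify_columns(seq1, seq2, gap="*"):
    """Label each aligned column as 'match', 'substitution', or 'indel'."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have equal length")
    labels = []
    for a, b in zip(seq1, seq2):
        if a == gap or b == gap:
            labels.append("indel")          # position present in only one taxon
        elif a == b:
            labels.append("match")          # common position, same nucleotide
        else:
            labels.append("substitution")   # common position, different nucleotides
    return labels

# The two aligned fragments from the example above.
labels = classify_columns("AGTAACT", "AT***CA")
```

For the fragments shown, the labels read: match, substitution, three indels, match, substitution.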

Our basic data are DNA sequences for each of our taxa that have been
preprocessed in some suitable way to align them. For simplicity, we suppose that we
are dealing with segments where there have been no insertions or deletions, so all 
the taxa share the same common positions and differences between nucleotides 
at these positions are due to substitutions. 

The standard statistical paradigm dictates (in very broad terms) how we 
should go about taking these data and producing inferences about the phylogeny 
connecting our taxa. Firstly, we should begin with a probability model that 
incorporates the possible trees as a “parameter” along with other parameters 
that describe the mechanism by which substitutions occur relative to such a 
tree. Secondly, we should determine the choice of parameters (in particular, 
the choice of tree) that best fits the observed sequence data according to some 
criterion. 

A standard assumption in the literature is that the behaviour at widely sepa- 
rated positions on the genome is statistically independent. With this assumption, 
the modelling problem reduces to one of modelling the nucleotide observed at a 
given position. 

In order to describe the general class of single position models typically used 
in the literature, it is easiest to begin by imagining that we can observe not 
only the nucleotides for the taxa but also those for the unobserved intermediates 
represented by the interior vertices of the tree. (For simplicity, let us refer to the 
taxa and the intermediates as “individuals” for the moment.) Two individuals 
share the same lineage up to their most recent common ancestor and so the 
processes such as mutation leading to substitution act on the genomes of their 
common ancestors in the same way up until the split in lineages that occurs at the 
most recent common ancestor. After the split in lineages, it is a reasonable first 
approximation to assume that the random mechanisms by which substitutions 
occur are operating independently on the genomes of the ancestors that are no 
longer shared. Mathematically, this translates into an assumption that
the nucleotides exhibited by two individuals are conditionally independent given 
the nucleotide exhibited by their most recent common ancestor. Equivalently, 
the nucleotides exhibited by two individuals are conditionally independent given 
the nucleotide exhibited by any individual on the path that connects the two 
individuals in the tree. 

For example, consider the tree

[Tree diagram: root 7 with children 5 and 6; vertex 5 has leaves 1 and 2, and vertex 6 has leaves 3 and 4]

with four taxa. Letting $Y_i$ denote the nucleotide exhibited by individual $i$, we
have, for example, that

• $Y_1$ and $Y_2$ are conditionally independent given $Y_5$,

• the pair $(Y_1, Y_2)$ is conditionally independent of the pair $(Y_3, Y_4)$ given any
one of $Y_5$, $Y_6$, or $Y_7$.

Because of this dependence structure, a joint probability such as

$$\mathbb{P}\{Y_1 = A,\ Y_2 = A,\ Y_3 = G,\ Y_4 = C,\ Y_5 = T,\ Y_6 = T,\ Y_7 = A\}$$

can be computed as

$$
\begin{aligned}
&\mathbb{P}\{Y_7 = A\} \times \mathbb{P}\{Y_5 = T \mid Y_7 = A\} \times \mathbb{P}\{Y_6 = T \mid Y_7 = A\} \times \mathbb{P}\{Y_1 = A \mid Y_5 = T\} \\
&\quad \times \mathbb{P}\{Y_2 = A \mid Y_5 = T\} \times \mathbb{P}\{Y_3 = G \mid Y_6 = T\} \times \mathbb{P}\{Y_4 = C \mid Y_6 = T\}.
\end{aligned}
$$

Thus, for a given tree, the joint probabilities of the individuals exhibiting a
particular set of nucleotides are determined by the vector of 4 unconditional
probabilities for the root individual and the $4 \times 4$ matrices of conditional
probabilities for each edge.
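
The factorization can be checked numerically. The sketch below hard-codes the seven-vertex tree of the example (leaves 1 to 4, intermediates 5 and 6, root 7) and multiplies one root probability by one conditional probability per edge. The uniform root distribution and the substitution matrix with 0.7 on the diagonal are invented purely for illustration, and the names `PARENT` and `joint_probability` are hypothetical.

```python
import numpy as np

NUC = "AGCT"
# Child -> parent map for the example tree; vertex 7 is the root.
PARENT = {1: 5, 2: 5, 3: 6, 4: 6, 5: 7, 6: 7}
ROOT = 7

def joint_probability(assignment, root_dist, subst):
    """Probability that every vertex exhibits the nucleotide given in `assignment`.

    assignment : dict vertex -> nucleotide letter
    root_dist  : length-4 vector of root probabilities, ordered as NUC
    subst      : dict child vertex -> 4x4 matrix, rows indexed by the parent's nucleotide
    """
    prob = root_dist[NUC.index(assignment[ROOT])]
    for child, parent in PARENT.items():
        i = NUC.index(assignment[parent])
        j = NUC.index(assignment[child])
        prob *= subst[child][i, j]       # one conditional probability per edge
    return prob

# Illustrative parameters (assumed, not from the text).
root_dist = np.full(4, 0.25)
M = np.full((4, 4), 0.1) + 0.6 * np.eye(4)   # 0.7 on the diagonal, 0.1 elsewhere
subst = {v: M for v in PARENT}
example = {1: "A", 2: "A", 3: "G", 4: "C", 5: "T", 6: "T", 7: "A"}
p_example = joint_probability(example, root_dist, subst)
```

In this particular assignment every edge involves a change of nucleotide, so the result is the root probability times six off-diagonal entries.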

Given such a model for the nucleotides exhibited by all the individuals (taxa 
and intermediates), we obtain a model for the nucleotides exhibited by the taxa 
by taking the marginal probability distribution for the taxa. Operationally, this 
just means that we sum over the possibilities for the intermediates. 

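For example, suppose that we have the tree

[Tree diagram: root 3 with two leaf children, 1 and 2]

with two taxa. Then, for example,

$$
\begin{aligned}
\mathbb{P}\{Y_1 = A, Y_2 = G\}
&= \mathbb{P}\{Y_1 = A, Y_2 = G, Y_3 = A\} + \mathbb{P}\{Y_1 = A, Y_2 = G, Y_3 = G\} \\
&\quad + \mathbb{P}\{Y_1 = A, Y_2 = G, Y_3 = C\} + \mathbb{P}\{Y_1 = A, Y_2 = G, Y_3 = T\} \\
&= \mathbb{P}\{Y_3 = A\} \times \mathbb{P}\{Y_1 = A \mid Y_3 = A\} \times \mathbb{P}\{Y_2 = G \mid Y_3 = A\} \\
&\quad + \mathbb{P}\{Y_3 = G\} \times \mathbb{P}\{Y_1 = A \mid Y_3 = G\} \times \mathbb{P}\{Y_2 = G \mid Y_3 = G\} \\
&\quad + \mathbb{P}\{Y_3 = C\} \times \mathbb{P}\{Y_1 = A \mid Y_3 = C\} \times \mathbb{P}\{Y_2 = G \mid Y_3 = C\} \\
&\quad + \mathbb{P}\{Y_3 = T\} \times \mathbb{P}\{Y_1 = A \mid Y_3 = T\} \times \mathbb{P}\{Y_2 = G \mid Y_3 = T\}.
\end{aligned}
$$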
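
For the two-taxon tree, the sum over the root nucleotide can be written as a single matrix product: the table of marginal probabilities is $P_1^{\top} \operatorname{diag}(\pi) P_2$, where $\pi$ is the root distribution and $P_1$, $P_2$ are the substitution matrices on the two edges. The sketch below checks this on invented numbers; the values of `pi`, `P1`, `P2` and the function name are assumptions for the example.

```python
import numpy as np

def two_taxon_marginal(root_dist, P1, P2):
    """4x4 table whose (a, g) entry is P{Y1 = a, Y2 = g}, summing over the root."""
    return P1.T @ np.diag(root_dist) @ P2

# Illustrative parameters (assumed); nucleotides are ordered A, G, C, T.
pi = np.array([0.1, 0.2, 0.3, 0.4])
P1 = np.full((4, 4), 0.1) + 0.6 * np.eye(4)   # rows sum to 1
P2 = np.full((4, 4), 0.2) + 0.2 * np.eye(4)   # rows sum to 1
table = two_taxon_marginal(pi, P1, P2)

# The same entry computed term by term, as in the displayed sum.
A, G = 0, 1
by_hand = sum(pi[b] * P1[b, A] * P2[b, G] for b in range(4))
```

Here `table[A, G]` and `by_hand` agree, and the sixteen entries of `table` sum to one, as a joint distribution must.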

We now introduce some notation to describe in full generality the sort of 
model we have just outlined. 

Let $T$ be a finite rooted tree. Write $\rho$ for the root of $T$, $V$ for the set of
vertices of $T$, and $L \subset V$ for the set of leaves. We regard $T$ as a directed graph
with edge directions leading away from the root. The elements of $L$ correspond
to the taxa, the tree $T$ is the phylogenetic tree for the taxa, and the elements of
$V \setminus L$ correspond to ancestors alive at times when the lineages of taxa diverge.
It is convenient to enumerate $L$ as $(l_1, \ldots, l_m)$ and $V$ as $(v_1, \ldots, v_n)$, with the
convention that $l_j = v_j$ for $j = 1, \ldots, m$ and $\rho = v_n$.

Each vertex $v \in V$ other than the root $\rho$ has a father $\sigma(v)$ (that is, there
is a unique $\sigma(v) \in V$ such that the directed edge $(\sigma(v), v)$ is in the rooted tree
$T$). If $v_\alpha$ and $v_\omega$ are two vertices such that there exist vertices $v_\beta, v_\gamma, \ldots, v_\psi$ with
$\sigma(v_\beta) = v_\alpha$, $\sigma(v_\gamma) = v_\beta$, \ldots, $\sigma(v_\omega) = v_\psi$ (that is, there is a directed path in $T$
from $v_\alpha$ to $v_\omega$), then we say that $v_\omega$ is a descendent of $v_\alpha$ or that $v_\alpha$ is an ancestor
of $v_\omega$ and we write $v_\alpha \le v_\omega$ or $v_\omega \ge v_\alpha$. Note that a vertex is its own ancestor and
its own descendent. The outdegree $\operatorname{outdeg}(u)$ of $u \in V$ is the number of children
of $u$, that is, the number of $v \in V$ such that $u = \sigma(v)$. To avoid degeneracies
we always suppose that $\operatorname{outdeg}(v) \ge 2$ for all $v \in V \setminus L$. (Note: Terms such as
“father” and “child” are just standard terminology from the theory of trees and
don’t have any biological significance; an edge in our tree may correspond to
thousands of actual generations.)

Let $\pi$ be a probability distribution on $\{A, G, C, T\}$, the root distribution.
The probability $\pi(B)$ is the probability that the common ancestor at the root
exhibits nucleotide $B$. For each vertex $v \in V \setminus \{\rho\}$, let $P^{(v)}$ be a stochastic
matrix on $\{A, G, C, T\}$ (that is, the rows of $P^{(v)}$ are probability distributions on
$\{A, G, C, T\}$). We refer to $P^{(v)}$ as the substitution matrix associated with the
edge $(\sigma(v), v)$. The entry $P^{(v)}(B', B'')$ is the conditional probability that the
individual at vertex $v$ exhibits nucleotide $B''$ given that the individual at vertex
$\sigma(v)$ exhibits nucleotide $B' \in \{A, G, C, T\}$.

Define a probability distribution $\mu$ on $\{A, G, C, T\}^V$ by setting

\[
\mu\bigl((B_v)_{v \in V}\bigr) := \pi(B_\rho) \prod_{v \in V \setminus \{\rho\}} P^{(v)}(B_{\sigma(v)}, B_v).
\]

The distribution $\mu$ is the joint distribution of the nucleotides exhibited by all of
the individuals in the tree, both the taxa and the unobserved ancestors. The
induced marginal distribution on $\{A, G, C, T\}^L$ is

\[
p\bigl((B_\ell)_{\ell \in L}\bigr) := \sum_{v \in V \setminus L} \sum_{B_v} \mu\bigl((B_v)_{v \in V \setminus L}, (B_\ell)_{\ell \in L}\bigr),
\]

where each of the dummy variables $B_v$, $v \in V \setminus L$, is summed over the set
$\{A, G, C, T\}$. The distribution $p$ is the joint distribution of the nucleotides
exhibited by the taxa.
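The marginalisation over unobserved ancestors can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the helper names (`sub_matrix`, `joint`, `marginal`), the root distribution and the two edge matrices are all made up for the example, and the tree is the two-taxon tree used earlier (root 3, leaves 1 and 2).

```python
from itertools import product

NUCS = "AGCT"

def sub_matrix(stay):
    """Illustrative stochastic matrix: keep the nucleotide with probability
    `stay`, otherwise move to one of the other three uniformly."""
    move = (1.0 - stay) / 3.0
    return {a: {b: (stay if a == b else move) for b in NUCS} for a in NUCS}

def joint(assign, root, parent, pi, P):
    """mu(assign) = pi(B_root) * product over non-root v of P[v][B_father][B_v]."""
    prob = pi[assign[root]]
    for v, father in parent.items():
        prob *= P[v][assign[father]][assign[v]]
    return prob

def marginal(leaves, internal, root, parent, pi, P):
    """Sum the joint distribution over all nucleotides of the internal vertices."""
    p = {}
    for leaf_vals in product(NUCS, repeat=len(leaves)):
        total = 0.0
        for int_vals in product(NUCS, repeat=len(internal)):
            assign = dict(zip(leaves, leaf_vals))
            assign.update(zip(internal, int_vals))
            total += joint(assign, root, parent, pi, P)
        p[leaf_vals] = total
    return p

# Two-taxon tree: root 3 with leaves 1 and 2 (all numbers are made up).
parent = {1: 3, 2: 3}
pi = {"A": 0.4, "G": 0.3, "C": 0.2, "T": 0.1}
P = {1: sub_matrix(0.7), 2: sub_matrix(0.85)}
p = marginal([1, 2], [3], 3, parent, pi, P)
```

Here `p[("A", "G")]` is the four-term sum written out in the two-taxon example, and the values of `p` sum to 1. The brute-force double loop is exponential in the number of vertices, so it is only meant for very small trees.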

With this model in hand, we could try to make inferences from sequence 
data using standard statistical techniques. For example, we could apply the 
method of maximum likelihood where we determine the choice of the parameters 
$T$, $\pi$, and $P^{(v)}$, $v \in V \setminus \{\rho\}$, that makes the probability of the observed data
greatest. (As we discussed above, we would need to observe the nucleotides at 
several positions and assume they were independent and governed by the same 
single-position model.) Maximum likelihood is known to have various optimality 
properties when we have large numbers of data, but unless we have just a few 
taxa there are a huge number of parameters over which we have to optimise 
and implementing maximum likelihood directly is numerically infeasible. There 
are various approaches to overcoming these difficulties — for instance, we can 
maximise likelihoods 4 taxa at a time and hope to fit the subtrees inferred in this
manner into one overall tree for all the taxa. Another approach is to constrain
the substitution matrices in some way and hope that the extra structure this 
introduces makes the inferential problem easier to solve (while still retaining some 
degree of biological plausibility.) That is the approach we will follow starting in 
the next section. 



3. More Specific Models 

The general model for the observed nucleotides outlined in Section 2 allows
the substitution matrices to be arbitrary. As we discussed in Section 2, there
are practical reasons for constraining the form of these matrices.

The substitution matrix $P^{(v)}$ represents the cumulative effect of the substitutions
that occur between the times that the individuals associated with $\sigma(v)$ and
$v$ were alive. In order to arrive at a reasonable form for $P^{(v)}$, it is profitable to
think about how we would go about modelling the dynamics of this substitution
process.

The most natural and tractable dynamics are (time-homogeneous) Markovian
ones. That is, if the position currently exhibits a certain nucleotide, $B'$ say, then
(independently of the past) the nucleotide changes at rate $r(B', B'')$ to some
other nucleotide $B''$. More formally, if the position currently exhibits nucleotide
$B'$, then:

• independently of the past, the probability that the elapsed time until a change
occurs is greater than $t$ is $\exp\bigl(-\sum_{B''} r(B', B'')\, t\bigr)$,

• independently of how long it takes until a change occurs, the probability that
it is to $B''$ is proportional to $r(B', B'')$.

There are obvious caveats in the use of such Markov chain models. Certain 
positions on the genome can’t be altered without serious consequences for the 
viability of the organism, and so a model that allows substitution to occur in a 
completely random fashion is not appropriate at such positions. However, if we 
look at positions that are not associated with regions of the genome that have 
an identifiable function, then it is somewhat difficult to recognise two positions 
as being the “same” in two different individuals for the purposes of alignment. 
Some care is therefore necessary in practice to find positions that can be aligned 
but are such that a Markov chain model is plausible. 

The simplest Markov chain model for nucleotide substitution is the Jukes-Cantor
model [JC69; Ney71] in which $r(B', B'')$ is the same for all $B', B''$. Under
this model, the distribution of the amount of time spent at a nucleotide before
a change occurs does not depend on the nucleotide and all 3 choices of the new
nucleotide are equally likely when a change occurs.

Biochemically, the nucleotides fall into two families: the purines (adenine and
guanine) and the pyrimidines (cytosine and thymine). Substitutions within a
family are called transitions, and they have a different biochemical status to
substitutions between families, which are called transversions. Kimura [Kim80]
proposed a model that recognised this distinction by assigning a common rate
to all the transversions and a possibly different common rate to all the transitions.
We can represent the rates schematically as follows:



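[Figure: rate diagram on the nucleotides A, G, C, T. Solid arrows join A with G and C with T (transitions); dashed arrows join A with C, A with T, G with C, and G with T (transversions).]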



The solid arrows represent transitions and the dashed arrows represent transversions.
There are two rate parameters, $\alpha, \beta > 0$, say, such that $r(B', B'') = \alpha$ if
$B'$ and $B''$ are connected by a solid arrow, and $r(B', B'') = \beta$ if $B'$ and $B''$ are
connected by a dashed arrow.

Later, Kimura [Kim81] introduced a generalisation of this model with the 
following rate structure: 



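[Figure: rate diagram on the nucleotides A, G, C, T with three kinds of arrows. Solid arrows join A with G and C with T; dashed arrows join A with C and G with T; double arrows join A with T and G with C.]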







Now there are 3 types of arrows (solid, dashed, and double) and 3 corresponding
rate parameters ($\alpha, \beta, \gamma > 0$, say). For example, if the current nucleotide is A
then, independently of the past, the probability that it takes longer than time $t$
until a change is $\exp(-(\alpha + \beta + \gamma)t)$ and, independently of how long it takes until
a change, the change is to G with probability $\alpha/(\alpha + \beta + \gamma)$, to C with probability
$\beta/(\alpha + \beta + \gamma)$, and to T with probability $\gamma/(\alpha + \beta + \gamma)$. There does not appear
to be a convincing biological rationale for this model with $\beta \ne \gamma$. However,
the extra parameter allows some more flexibility in fitting to data. Moreover,
the analysis of the three-parameter model is no more difficult than that of the
two-parameter one, and is even somewhat clearer from an expository point of
view. We refer the reader to [ES93; EZ98] for the changes that are necessary in
what follows when dealing with the one- and two-parameter models.

Probabilists usually record the rates for a Markov chain as an infinitesimal 
generator matrix. For example, the infinitesimal generator for the three-param- 
eter Kimura model is 



\[
Q \;=\;
\begin{array}{c|cccc}
  & A & G & C & T \\ \hline
A & -(\alpha + \beta + \gamma) & \alpha & \beta & \gamma \\
G & \alpha & -(\alpha + \beta + \gamma) & \gamma & \beta \\
C & \beta & \gamma & -(\alpha + \beta + \gamma) & \alpha \\
T & \gamma & \beta & \alpha & -(\alpha + \beta + \gamma)
\end{array}
\]

The infinitesimal generator is more than just an accounting device: for any
$s, t > 0$ the entry in row $B'$ and column $B''$ of the matrix

\[
\exp(tQ) = I + tQ + \frac{t^2 Q^2}{2!} + \cdots
\]

gives the conditional probability that nucleotide $B''$ will be exhibited at time
$s + t$ given that nucleotide $B'$ is exhibited at time $s$.

Because the matrix $Q$ is symmetric, $\exp(tQ)$ can be computed using the spectral
theorem once the eigenvalues and eigenvectors of $Q$ have been computed.
This is straightforward for $Q$, but we won’t go into the details. Also, the
diagonalisation follows easily using the Fourier ideas of Section 6. As an example, the
conditional probability that nucleotide A will be exhibited at time $s + t$ given
that nucleotide A is exhibited at time $s$ is

\[
\tfrac{1}{4}\bigl(1 + \exp(-2t(\alpha + \gamma)) + \exp(-2t(\beta + \gamma)) + \exp(-2t(\alpha + \beta))\bigr),
\]

and the conditional probability that nucleotide G will be exhibited at time
$s + t$ given that nucleotide A is exhibited at time $s$ is

\[
\tfrac{1}{4}\bigl(1 - \exp(-2t(\alpha + \gamma)) + \exp(-2t(\beta + \gamma)) - \exp(-2t(\alpha + \beta))\bigr).
\]

Both of these probabilities converge to $\tfrac{1}{4}$ as $t \to \infty$: of course, we expect from
the symmetries of the Markov chain that if it evolves for a long time, then it will
converge towards an equilibrium distribution in which all nucleotides are equally
likely to be exhibited.
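As a numerical sanity check on these closed forms, one can sum the exponential series directly. The following is a minimal sketch, assuming arbitrary illustrative values for $\alpha$, $\beta$, $\gamma$ and $t$; the function names are hypothetical, and the truncated Taylor series is used only because the matrix is small and $t$ is moderate (it is not the spectral method mentioned in the text).

```python
import math

def kimura_generator(alpha, beta, gamma):
    """Generator Q of the three-parameter model, rows/columns ordered A, G, C, T."""
    d = -(alpha + beta + gamma)
    return [[d, alpha, beta, gamma],
            [alpha, d, gamma, beta],
            [beta, gamma, d, alpha],
            [gamma, beta, alpha, d]]

def mat_mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_exp(Q, t, terms=60):
    """exp(tQ) via the truncated series I + tQ + (tQ)^2/2! + ..."""
    result = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        scaled = [[t * q / k for q in row] for row in Q]
        term = mat_mul(term, scaled)          # term is now (tQ)^k / k!
        result = [[result[i][j] + term[i][j] for j in range(4)]
                  for i in range(4)]
    return result

def p_AA(t, a, b, c):
    return 0.25 * (1 + math.exp(-2*t*(a + c)) + math.exp(-2*t*(b + c))
                   + math.exp(-2*t*(a + b)))

def p_AG(t, a, b, c):
    return 0.25 * (1 - math.exp(-2*t*(a + c)) + math.exp(-2*t*(b + c))
                   - math.exp(-2*t*(a + b)))

alpha, beta, gamma, t = 1.0, 0.5, 0.25, 0.3   # illustrative values only
P_t = mat_exp(kimura_generator(alpha, beta, gamma), t)
```

With these values, `P_t[0][0]` and `P_t[0][1]` agree with `p_AA` and `p_AG` to within floating-point error, and each row of `P_t` sums to 1.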

It is clear without computing exp(tQ) explicitly that this matrix is of the form 





\[
\begin{array}{c|cccc}
  & A & G & C & T \\ \hline
A & w & x & y & z \\
G & x & w & z & y \\
C & y & z & w & x \\
T & z & y & x & w
\end{array}
\]



where $0 < w, x, y, z < 1$. Not all such matrices are given by $\exp(tQ)$ for a suitable
choice of $\alpha, \beta, \gamma, t$. However, we suppose from now on that each substitution
matrix is of this somewhat more general form for some $w, x, y, z$ (that can
vary with $v$). Thus, once a tree $T$ with $m$ leaves and $n$ vertices is fixed, there
are $3n$ independent parameters in the model: 3 for the root distribution $\pi$ and
3 for each of the $n - 1$ substitution matrices. Note that each of the $4^m$ model
probabilities $p((B_\ell)_{\ell \in L})$, $(B_\ell)_{\ell \in L} \in \{A, G, C, T\}^L$, is a polynomial in these $3n$
variables.



4. Making Inferences 



From the development in Sections 2 and 3, we have a model for the joint 
probability of the taxa exhibiting a particular set of nucleotides. For more than
a small number of taxa, this model still has too many parameters for us to
apply maximum likelihood. Moreover, maximum likelihood necessarily estimates 
all the numerical parameters in the model, even though the tree parameter is 
typically the one that is of most interest. 

An alternative approach to estimating the tree that does not involve directly
estimating the numerical parameters was suggested in [CF87] and [Lak87]. The
idea behind this approach is as follows. For a given tree $T$, the model probabilities
$p((B_\ell)_{\ell \in L})$, $(B_\ell)_{\ell \in L} \in \{A, G, C, T\}^L$, have a specific functional form in terms
of the numerical parameters defining the root distribution and the substitution
matrices (indeed, the model probabilities are polynomials in these variables).
This should constrain the model probabilities to lie on some lower dimensional
surface in $\mathbb{R}^{\{A,G,C,T\}^L}$. Rather than represent this surface explicitly as the range of a
vector of polynomials, we could try to characterise the surface implicitly as a subset
of a locus of points in $\mathbb{R}^{\{A,G,C,T\}^L}$ that are common zeroes of a family of polynomials.
That is, we want to represent the surface as a subset of an algebraic variety.

Because we are assuming that the same model (with the same numerical substitution
mechanism parameters) governs each position in our data set and that the
behaviour at different positions is independent, the strong law of large numbers
gives that the quantities $p((B_\ell)_{\ell \in L})$, $(B_\ell)_{\ell \in L} \in \{A, G, C, T\}^L$, can be
consistently estimated in a model-free way by computing the proportion of positions
in our data set at which Taxon 1 exhibits nucleotide $B_1$, Taxon 2 exhibits
nucleotide $B_2$, etc. Call these estimates $\hat{p}((B_\ell)_{\ell \in L})$, $(B_\ell)_{\ell \in L} \in \{A, G, C, T\}^L$, so
that $\hat{p}((B_\ell)_{\ell \in L})$ will be close to $p((B_\ell)_{\ell \in L})$ with high probability when we observe
a sufficient number of different positions to have enough independent identically
distributed data points for the strong law of large numbers to kick in.

We hope that the varieties for two different trees (say, Tree I and Tree II) have
a “small” intersection and so a “generic” point on the variety for one tree will not
be a common zero of the polynomials defining the variety for the other tree. That
is, we hope that we can find a polynomial $f$ such that $f(p((B_\ell)_{\ell \in L})) = 0$ for all
choices of substitution mechanism parameters for Tree I whereas $f(p((B_\ell)_{\ell \in L})) \ne 0$
for all but a “small” set of choices of substitution mechanism parameters for
Tree II. If this is the case, then $f(\hat{p}((B_\ell)_{\ell \in L}))$ should be close to zero (that is,
“zero up to random error”) if Tree I is the correct tree regardless of the numerical
parameters in the model, whereas this quantity should be “significantly nonzero”
if Tree II is the correct tree unless we have been particularly unfortunate and
the numerical parameters are such that the vector $p((B_\ell)_{\ell \in L})$ happens to lie on
the intersection of the varieties for the two trees.

The polynomials that are zero on the algebraic variety associated with a tree 
are called the (phylogenetic) invariants of the model. Note that the set of in- 
variants has the structure of an ideal in the ring of polynomials in the model 
probabilities: the sum of two invariants is an invariant and the product of an 
invariant with an arbitrary polynomial is an invariant.

In order to use the invariant idea to reconstruct phylogenetic trees we need
to address the following questions: 

i) How do we recognize when a polynomial is an invariant? 

ii) How do we find a generating set for the ideal of invariants (and how big is 
such a set)? 

iii) Do different trees have different invariants? 

iv) How do we determine whether a vector of polynomials applied to estimates 
of the model probabilities is “zero up to random error” or “significantly 
nonzero” ? 

In principle, questions (i) and (ii) can be answered using general theory from 
computational commutative algebra. There is an algorithm using Gröbner bases
that solves the implicitization problem of finding a generating set for the ideal
of polynomials that are 0 on a general parametrically given algebraic variety
(see [CLO92]). Unfortunately, this algorithm appears to be computationally
infeasible for the size of problem that occurs for even a modest number of taxa. 
Other methods adapted to our particular problem are therefore necessary, and 
this is what we study in these notes. Along the way, we answer question (iii) 
and even establish how many algebraically independent invariants there are that 
distinguish between two trees. We don’t deal with the more statistical question 
(iv) in these notes. 



5. Some Group Structure 

We begin with a step that may seem somewhat bizarre at first, but pays
off handsomely. Consider the Klein 4-group $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ consisting of the elements
$\{(0,0), (0,1), (1,0), (1,1)\}$ equipped with the group operation of coordinatewise
addition modulo 2. The addition table for $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ is thus

\[
\begin{array}{c|cccc}
+     & (0,0) & (0,1) & (1,0) & (1,1) \\ \hline
(0,0) & (0,0) & (0,1) & (1,0) & (1,1) \\
(0,1) & (0,1) & (0,0) & (1,1) & (1,0) \\
(1,0) & (1,0) & (1,1) & (0,0) & (0,1) \\
(1,1) & (1,1) & (1,0) & (0,1) & (0,0)
\end{array}
\]



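Identify the nucleotides $\{A, G, C, T\}$ with the elements of $\mathbb{Z}_2 \oplus \mathbb{Z}_2$ as follows:
$A \leftrightarrow (0,0)$, $G \leftrightarrow (0,1)$, $C \leftrightarrow (1,0)$, and $T \leftrightarrow (1,1)$. This turns $\mathbb{G} := \{A, G, C, T\}$
into a group with the addition table

\[
\begin{array}{c|cccc}
+ & A & G & C & T \\ \hline
A & A & G & C & T \\
G & G & A & T & C \\
C & C & T & A & G \\
T & T & C & G & A
\end{array}
\]

Suppose that $X$ and $Y$ are two $\mathbb{G}$-valued random variables such that the
conditional distribution of $Y$ given $X$ is described by the matrix

\[
\begin{array}{c|cccc}
  & A & G & C & T \\ \hline
A & w & x & y & z \\
G & x & w & z & y \\
C & y & z & w & x \\
T & z & y & x & w
\end{array}
\]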



Note that $P\{Y = B'' \mid X = B'\}$ only depends on the pair of nucleotides $(B', B'')$
through the difference $B'' - B'$. It follows easily from this that the joint
distribution of the pair $(X, Y)$ is the same as that of the pair $(X, X + Z)$, where
$P\{Z = A\} = w$, $P\{Z = G\} = x$, $P\{Z = C\} = y$, $P\{Z = T\} = z$, and $Z$ is
independent of $X$.
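The group structure and the "depends only on the difference" observation are easy to check mechanically. The sketch below is illustrative only: the names `ENCODE`, `add` and `substitution_matrix` are hypothetical, and the matrix entries are kept as the symbols `w`, `x`, `y`, `z` so that the result can be compared with the matrix in the text by eye.

```python
NUCS = "AGCT"
ENCODE = {"A": (0, 0), "G": (0, 1), "C": (1, 0), "T": (1, 1)}
DECODE = {pair: nuc for nuc, pair in ENCODE.items()}

def add(x, y):
    """Group operation on nucleotides: coordinatewise addition modulo 2."""
    (a1, b1), (a2, b2) = ENCODE[x], ENCODE[y]
    return DECODE[((a1 + a2) % 2, (b1 + b2) % 2)]

def substitution_matrix(q):
    """Row B', column B'': q(B'' - B'). Every element of this group is its own
    inverse, so the difference B'' - B' equals the sum B'' + B'."""
    return {b1: {b2: q[add(b2, b1)] for b2 in NUCS} for b1 in NUCS}

# Symbolic distribution of Z: P{Z=A}=w, P{Z=G}=x, P{Z=C}=y, P{Z=T}=z.
q = {"A": "w", "G": "x", "C": "y", "T": "z"}
M = substitution_matrix(q)
rows = ["".join(M[b1][b2] for b2 in NUCS) for b1 in NUCS]
```

The list `rows` reproduces the four rows of the matrix above, which is just another way of saying that $Y$ has the same conditional law as $X + Z$.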

The model that we described in Section 3 had an arbitrary root distribution
$\pi$ and substitution matrices $P^{(v)}$ that satisfy $P^{(v)}(B', B'') = q^{(v)}(B'' - B')$ for
some probability distribution $q^{(v)}$ on $\mathbb{G}$. Repeatedly applying the observation
of the previous paragraph shows that if $(Z_v)_{v \in V}$ is a vector of independent
$\mathbb{G}$-valued random variables, with $Z_\rho$ having distribution $\pi$, and $Z_v$, $v \in V \setminus \{\rho\}$,
having distribution $q^{(v)}$, then the $\mathbb{G}$-valued random variables

\[
Y_\ell := \sum_{v \le \ell} Z_v, \qquad \ell \in L,
\]

have joint distribution

\[
P\{Y_{\ell_1} = B_{\ell_1}, \ldots, Y_{\ell_m} = B_{\ell_m}\} = p\bigl((B_\ell)_{\ell \in L}\bigr).
\]

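That is, by suitable addition of independent $\mathbb{G}$-valued “weights,” we can
construct a vector of random variables having the same joint distribution as the
nucleotides exhibited by the taxa. For example, for the tree

[Figure: example tree with three taxa; not recoverable from the source.]

the construction is

[Display: each of $Y_1$, $Y_2$, $Y_3$ written as the sum of the weights $Z_v$ over the ancestors $v$ of the corresponding leaf; the individual terms are not recoverable from the source.]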
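The equivalence between the matrix description and the sum-of-weights description can be checked by brute force on the two-taxon tree of Section 2 (root 3, leaves 1 and 2), where $Y_1 = Z_1 + Z_3$ and $Y_2 = Z_2 + Z_3$. This is a sketch under assumptions: nucleotides are encoded as the integers 0, 1, 2, 3 for A, G, C, T (so that group addition becomes bitwise XOR), and the three distributions are made-up numbers.

```python
from itertools import product

# Encoding assumed here: A=0 (0,0), G=1 (0,1), C=2 (1,0), T=3 (1,1);
# addition in the group is then the bitwise XOR of the codes.

def p_from_matrices(pi, q1, q2):
    """Marginal of the leaves using P^(v)(B', B'') = q^(v)(B'' - B')."""
    p = {}
    for b1, b2 in product(range(4), repeat=2):
        p[(b1, b2)] = sum(pi[b3] * q1[b1 ^ b3] * q2[b2 ^ b3]
                          for b3 in range(4))
    return p

def p_from_weights(pi, q1, q2):
    """Law of (Z1 + Z3, Z2 + Z3) for independent Z1 ~ q1, Z2 ~ q2, Z3 ~ pi."""
    p = {key: 0.0 for key in product(range(4), repeat=2)}
    for z1, z2, z3 in product(range(4), repeat=3):
        p[(z1 ^ z3, z2 ^ z3)] += q1[z1] * q2[z2] * pi[z3]
    return p

pi = [0.4, 0.3, 0.2, 0.1]       # made-up root distribution
q1 = [0.7, 0.1, 0.15, 0.05]     # made-up weight distributions
q2 = [0.6, 0.2, 0.1, 0.1]
pa = p_from_matrices(pi, q1, q2)
pb = p_from_weights(pi, q1, q2)
```

The two dictionaries agree entry by entry, as the paragraph above asserts for general trees.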



6. A Little Fourier Analysis 

We’ve seen that the model of Section 3 can be represented in terms of sums
of independent random variables taking values in a finite, Abelian group.
Probabilists have known for a long time that Fourier analysis is a very powerful
technique for handling such sums. In this section we’ll review some basic facts
about Fourier analysis for an arbitrary finite, Abelian group $(\mathbb{H}, +)$.




FOURIER ANALYSIS AND PHYLOGENETIC TREES 






Let $\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$ denote the unit circle in the complex plane, and
regard $\mathbb{T}$ as an Abelian group with the group operation being ordinary complex
multiplication. The characters of $\mathbb{H}$ are the group homomorphisms mapping
$\mathbb{H}$ into $\mathbb{T}$. That is, $\chi : \mathbb{H} \to \mathbb{T}$ is a character if $\chi(h_1 + h_2) = \chi(h_1)\chi(h_2)$ for
all $h_1, h_2 \in \mathbb{H}$. The characters form an Abelian group under the operation of
pointwise multiplication of functions. This group is called the dual group of $\mathbb{H}$
and is denoted by $\hat{\mathbb{H}}$. The groups $\mathbb{H}$ and $\hat{\mathbb{H}}$ are isomorphic. Given $h \in \mathbb{H}$ and
$\chi \in \hat{\mathbb{H}}$, write $\langle h, \chi \rangle$ for $\chi(h)$.

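The elements of $\hat{\mathbb{H}}$ form an orthogonal basis for the space of functions from
$\mathbb{H}$ to $\mathbb{C}$. Given a function $f : \mathbb{H} \to \mathbb{C}$, the Fourier transform of $f$ is the function
$\hat{f} : \hat{\mathbb{H}} \to \mathbb{C}$ given by

\[
\hat{f}(\chi) = \sum_{h \in \mathbb{H}} f(h) \langle h, \chi \rangle.
\]

A function can be recovered from its Fourier transform via Fourier inversion:

\[
f(h) = \frac{1}{|\mathbb{H}|} \sum_{\chi \in \hat{\mathbb{H}}} \hat{f}(\chi)\, \overline{\langle h, \chi \rangle}.
\]

Given two finite, Abelian groups $\mathbb{H}'$ and $\mathbb{H}''$, the dual of the product group
$\mathbb{H}' \oplus \mathbb{H}''$ is isomorphic to $\hat{\mathbb{H}}' \oplus \hat{\mathbb{H}}''$ via the identification

\[
\langle (h', h''), (\chi', \chi'') \rangle = \langle h', \chi' \rangle \times \langle h'', \chi'' \rangle.
\]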

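One may write $\hat{\mathbb{G}} = \{1, \phi, \psi, \phi\psi\}$, where the following table gives the values
of $\langle g, \chi \rangle$ for $g \in \mathbb{G}$ and $\chi \in \hat{\mathbb{G}}$:

\[
\begin{array}{c|cccc}
         & (0,0) & (0,1) & (1,0) & (1,1) \\ \hline
1        & 1 & 1  & 1  & 1  \\
\phi     & 1 & -1 & 1  & -1 \\
\psi     & 1 & 1  & -1 & -1 \\
\phi\psi & 1 & -1 & -1 & 1
\end{array}
\]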



The characteristic function of an $\mathbb{H}$-valued random variable $X$ is the Fourier
transform of its probability mass function:

\[
\sum_{h \in \mathbb{H}} P\{X = h\} \langle h, \chi \rangle = E[\langle X, \chi \rangle], \qquad \chi \in \hat{\mathbb{H}}
\]

(here, following the usual convention in probability theory, $\langle X, \chi \rangle$ is the random
variable obtained by composing the random variable $X$ with the function $\langle \cdot, \chi \rangle$).
The probability mass function of $X$ can be recovered from its Fourier transform
by Fourier inversion:

\[
P\{X = h\} = \frac{1}{|\mathbb{H}|} \sum_{\chi \in \hat{\mathbb{H}}} E[\langle X, \chi \rangle]\, \overline{\langle h, \chi \rangle}.
\]

Finally, note that if $X'$ and $X''$ are independent $\mathbb{H}$-valued random variables,
then

\[
E[\langle X' + X'', \chi \rangle] = E[\langle X', \chi \rangle \langle X'', \chi \rangle] = E[\langle X', \chi \rangle]\, E[\langle X'', \chi \rangle].
\]

That is, the characteristic function of $X' + X''$ is the product of the characteristic
functions of $X'$ and $X''$.
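For the Klein 4-group these facts fit in a few lines of code. The sketch below assumes an integer encoding that is not in the text: group elements A, G, C, T are 0, 1, 2, 3 (addition is XOR), and the characters $1, \phi, \psi, \phi\psi$ are indexed 0, 1, 2, 3, so that $\langle g, \chi \rangle = (-1)^{\text{popcount}(g \,\&\, \chi)}$ reproduces the character table. The function names are hypothetical.

```python
def pairing(g, c):
    """<g, chi_c> for the Klein 4-group under the integer encoding above."""
    return -1 if bin(g & c).count("1") % 2 else 1

def fourier(f):
    """Fourier transform: fhat[c] = sum over h of f[h] * <h, chi_c>."""
    return [sum(f[h] * pairing(h, c) for h in range(4)) for c in range(4)]

def inverse(fhat):
    """Fourier inversion; the characters are real, so no conjugate is needed."""
    return [sum(fhat[c] * pairing(h, c) for c in range(4)) / 4
            for h in range(4)]

def convolve(f, g):
    """Mass function of X' + X'' for independent X' ~ f and X'' ~ g."""
    out = [0.0] * 4
    for a in range(4):
        for b in range(4):
            out[a ^ b] += f[a] * g[b]
    return out

f = [0.7, 0.1, 0.15, 0.05]    # made-up probability mass functions
g = [0.4, 0.3, 0.2, 0.1]
lhs = fourier(convolve(f, g))
rhs = [x * y for x, y in zip(fourier(f), fourier(g))]
```

Here `lhs` and `rhs` coincide, which is the product rule for characteristic functions, and `inverse(fourier(f))` returns `f` up to rounding.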



7. Finding an Invariant

Let’s begin by seeing how the observations of Sections 5 and 6 can be used to 
find an invariant for an instance of the model of Section 3. 

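Consider the tree

[Figure: rooted tree with root 5. The root has children 4 and 3, and vertex 4 has children 1 and 2, so the leaves are 1, 2, 3.]

with the associated model for the nucleotides $Y_1, Y_2, Y_3$ exhibited by the taxa
written in terms of independent $\mathbb{G}$-valued random variables $Z_1, \ldots, Z_5$ as follows:

\begin{align*}
Y_1 &= Z_1 + Z_4 + Z_5 \\
Y_2 &= Z_2 + Z_4 + Z_5 \\
Y_3 &= Z_3 + Z_5
\end{align*}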

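Using the results of Section 6 and the notation given there for the characters
of $\mathbb{G}$ we have

\begin{align*}
E[\langle Y_1, \phi \rangle \langle Y_2, \phi \rangle \langle Y_3, \psi \rangle]
&= E[\langle Z_1, \phi \rangle \langle Z_4, \phi \rangle \langle Z_5, \phi \rangle \langle Z_2, \phi \rangle \langle Z_4, \phi \rangle \langle Z_5, \phi \rangle \langle Z_3, \psi \rangle \langle Z_5, \psi \rangle] \\
&= E[\langle Z_1, \phi \rangle] \times E[\langle Z_2, \phi \rangle] \times E[\langle Z_3, \psi \rangle] \times E[\langle Z_4, \phi^2 \rangle] \times E[\langle Z_5, \phi^2 \psi \rangle] \\
&= E[\langle Z_1, \phi \rangle] \times E[\langle Z_2, \phi \rangle] \times E[\langle Z_3, \psi \rangle] \times E[\langle Z_5, \psi \rangle].
\end{align*}

A similar argument shows that

\[
E[\langle Y_1, \phi \rangle \langle Y_2, \phi \rangle]\, E[\langle Y_3, \psi \rangle] = E[\langle Z_1, \phi \rangle]\, E[\langle Z_2, \phi \rangle]\, E[\langle Z_3, \psi \rangle]\, E[\langle Z_5, \psi \rangle].
\]

Thus

\[
E[\langle Y_1, \phi \rangle \langle Y_2, \phi \rangle \langle Y_3, \psi \rangle] - E[\langle Y_1, \phi \rangle \langle Y_2, \phi \rangle]\, E[\langle Y_3, \psi \rangle] = 0.
\]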

Writing all of the expectations in the last equation as sums in terms of the model
probabilities $p((B_\ell)_{\ell \in L})$ gives a polynomial in the model probabilities of total
degree 2 that is satisfied for all choices of the numerical parameters defining the
root distribution and the substitution matrices. Thus we have found an invariant 
for this tree. 

Now consider the tree

[Figure: rooted tree with root 5. The root has children 4 and 2, and vertex 4 has children 1 and 3, so the leaves are 1, 3, 2.]

with the associated model for the nucleotides $Y_1, Y_2, Y_3$ exhibited by the taxa
written in terms of independent $\mathbb{G}$-valued random variables $Z_1, \ldots, Z_5$ as follows:

\begin{align*}
Y_1 &= Z_1 + Z_4 + Z_5 \\
Y_2 &= Z_2 + Z_5 \\
Y_3 &= Z_3 + Z_4 + Z_5
\end{align*}

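Now

\begin{align*}
& E[\langle Y_1, \phi \rangle \langle Y_2, \phi \rangle \langle Y_3, \psi \rangle] - E[\langle Y_1, \phi \rangle \langle Y_2, \phi \rangle]\, E[\langle Y_3, \psi \rangle] \\
&\quad = E[\langle Z_1, \phi \rangle]\, E[\langle Z_2, \phi \rangle]\, E[\langle Z_3, \psi \rangle]\, E[\langle Z_4, \phi\psi \rangle]\, E[\langle Z_5, \psi \rangle] \\
&\qquad - E[\langle Z_1, \phi \rangle]\, E[\langle Z_2, \phi \rangle]\, E[\langle Z_3, \psi \rangle]\, E[\langle Z_4, \phi \rangle]\, E[\langle Z_4, \psi \rangle]\, E[\langle Z_5, \psi \rangle] \\
&\quad = E[\langle Z_1, \phi \rangle]\, E[\langle Z_2, \phi \rangle]\, E[\langle Z_3, \psi \rangle] \\
&\qquad \times \bigl(E[\langle Z_4, \phi\psi \rangle] - E[\langle Z_4, \phi \rangle]\, E[\langle Z_4, \psi \rangle]\bigr)\, E[\langle Z_5, \psi \rangle].
\end{align*}

It is not hard to show that the vector

\[
\bigl(E[\langle Z_4, \phi \rangle],\ E[\langle Z_4, \psi \rangle],\ E[\langle Z_4, \phi\psi \rangle]\bigr)
\]

ranges over a subset of $\mathbb{R}^3$ with nonempty interior as the distribution of $Z_4$
ranges over the set of possible distributions on $\mathbb{G}$. Thus

\[
E[\langle Z_4, \phi\psi \rangle] - E[\langle Z_4, \phi \rangle]\, E[\langle Z_4, \psi \rangle]
\]

is certainly not identically 0 and the invariant we found for the previous tree is
not an invariant for this tree.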
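The contrast between the two trees can be seen numerically. The following sketch evaluates the candidate invariant on the leaf distribution of each tree for one fixed, made-up choice of the five weight distributions. The integer encoding (A, G, C, T as 0 to 3, characters $1, \phi, \psi, \phi\psi$ as 0 to 3, vertices 1 to 5 as indices 0 to 4) and all function names are assumptions made for the illustration.

```python
from itertools import product

ONE, PHI, PSI = 0, 1, 2   # character indices; PHI ^ PSI == 3 is phi*psi

def pairing(g, c):
    return -1 if bin(g & c).count("1") % 2 else 1

def leaf_distribution(paths, dists):
    """Law of (Y_1, Y_2, Y_3), where Y_j is the group sum (XOR) of the
    independent weights Z_v over the vertices v listed in paths[j]."""
    p = {}
    for z in product(range(4), repeat=len(dists)):
        prob = 1.0
        for v, zv in enumerate(z):
            prob *= dists[v][zv]
        y = []
        for path in paths:
            s = 0
            for v in path:
                s ^= z[v]
            y.append(s)
        p[tuple(y)] = p.get(tuple(y), 0.0) + prob
    return p

def expect(p, chars):
    """E[ <Y_1, chars[0]> <Y_2, chars[1]> <Y_3, chars[2]> ] under p."""
    total = 0.0
    for y, prob in p.items():
        sign = 1
        for yj, cj in zip(y, chars):
            sign *= pairing(yj, cj)
        total += prob * sign
    return total

def f(p):
    """The degree-2 polynomial found for the first tree."""
    return (expect(p, (PHI, PHI, PSI))
            - expect(p, (PHI, PHI, ONE)) * expect(p, (ONE, ONE, PSI)))

dists = [[0.7, 0.1, 0.15, 0.05],   # Z_1 (all five are made-up numbers)
         [0.6, 0.2, 0.1, 0.1],     # Z_2
         [0.5, 0.2, 0.2, 0.1],     # Z_3
         [0.4, 0.3, 0.2, 0.1],     # Z_4
         [0.8, 0.1, 0.05, 0.05]]   # Z_5 (root)
tree_I = [[0, 3, 4], [1, 3, 4], [2, 4]]    # Y1=Z1+Z4+Z5, Y2=Z2+Z4+Z5, Y3=Z3+Z5
tree_II = [[0, 3, 4], [1, 4], [2, 3, 4]]   # Y1=Z1+Z4+Z5, Y2=Z2+Z5, Y3=Z3+Z4+Z5
f_I = f(leaf_distribution(tree_I, dists))
f_II = f(leaf_distribution(tree_II, dists))
```

For these parameters `f_I` vanishes up to rounding, while `f_II` equals the product formula above, here $0.7 \times 0.4 \times 0.4 \times (0 - 0.2 \times 0.4) \times 0.8 = -0.007168$. A single parameter choice of course illustrates the claim rather than proving it.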

8. Finding All Invariants 

The examples studied in Section 7 indicate how we should proceed to find all 
the invariants for a general tree. The ideas that we describe in this section were 
developed in [ES93]. 

We call a vector $(\chi_{\ell_1}, \ldots, \chi_{\ell_m}) \in \hat{\mathbb{G}}^m$ an allocation of characters to leaves.
Such an allocation of characters to leaves induces an allocation of characters to
vertices $(\chi_{v_1}, \ldots, \chi_{v_n}) \in \hat{\mathbb{G}}^n$ as follows. The character $\chi_v$ is the product of the
$\chi_\ell$ for all leaves $\ell$ that are descendents of $v$, that is,

\[
\chi_v := \prod_{\ell \ge v} \chi_\ell.
\]

In particular, if $v = v_i$ is a leaf (and hence the leaf $\ell_i$ by our numbering
convention), then $\chi_{v_i} = \chi_{\ell_i}$.
Let

\[
\{(\chi_{i,1}, \ldots, \chi_{i,n}) : i = 1, \ldots, 4^m\}
\]

be an enumeration of the various allocations of characters to vertices induced by
the $4^m$ different allocations of characters to leaves. Define $3n$ vectors $\{x_{v,\theta} =
(x^{(1)}_{v,\theta}, \ldots, x^{(4^m)}_{v,\theta}),\ v \in V,\ \theta = \phi, \psi, \phi\psi\}$ of dimension $4^m$ by setting

\[
x^{(i)}_{v_j,\theta} := \begin{cases} 1 & \text{if } \chi_{i,j} = \theta, \\ 0 & \text{otherwise,} \end{cases}
\]

for $i = 1, \ldots, 4^m$, $j = 1, \ldots, n$ and $\theta \in \{\phi, \psi, \phi\psi\}$.

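Write $\mathcal{R}(T)$ for the free $\mathbb{Z}$-module generated by the set $\{x_{v,\theta} : v \in V,\ \theta =
\phi, \psi, \phi\psi\}$. That is, $\mathcal{R}(T)$ is the collection of integer vectors of dimension $4^m$
consisting of $\mathbb{Z}$-linear combinations of the $x_{v,\theta}$. Set

\[
\mathcal{N}(T) := \Bigl\{ a \in \mathbb{Z}^{4^m} : \sum_i a_i x^{(i)}_{v,\theta} = 0,\ v \in V,\ \theta = \phi, \psi, \phi\psi \Bigr\},
\]

so that $\mathbb{Z}^{4^m} = \mathcal{N}(T) \oplus \mathcal{R}(T)$.

For $a \in \mathbb{Z}^{4^m}$, the polynomial

\[
\prod_{\{i : a_i > 0\}} \Bigl( E\Bigl[ \prod_{j=1}^{m} \langle Y_j, \chi_{i,j} \rangle \Bigr] \Bigr)^{a_i}
- \prod_{\{i : a_i < 0\}} \Bigl( E\Bigl[ \prod_{j=1}^{m} \langle Y_j, \chi_{i,j} \rangle \Bigr] \Bigr)^{-a_i}
\]

is an invariant if and only if $a \in \mathcal{N}(T)$. It is shown in [ES93] that this is
the only game in town: all invariants arise from algebraic combinations and
rearrangements of these basic invariants.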

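Indeed, it is shown in [ES93] that if $\{(a_{1,r}, \ldots, a_{4^m,r}),\ r = 1, \ldots, \operatorname{rank} \mathcal{N}(T)\}$
is a $\mathbb{Z}$-basis for the free $\mathbb{Z}$-module $\mathcal{N}(T)$, then the set of polynomials of the form

\[
\prod_{\{i : a_{i,r} > 0\}} \Bigl( E\Bigl[ \prod_{j=1}^{m} \langle Y_j, \chi_{i,j} \rangle \Bigr] \Bigr)^{a_{i,r}}
- \prod_{\{i : a_{i,r} < 0\}} \Bigl( E\Bigl[ \prod_{j=1}^{m} \langle Y_j, \chi_{i,j} \rangle \Bigr] \Bigr)^{-a_{i,r}},
\qquad r = 1, \ldots, \operatorname{rank} \mathcal{N}(T),
\]

generates the ideal of invariants but no subset thereof does. Finding a $\mathbb{Z}$-basis
for $\mathcal{N}(T)$ is just elementary linear algebra (we are simply finding a basis for
the null space of an integer-valued matrix) and can be done using Gaussian
elimination.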
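To connect this with Section 7, the sketch below builds the vectors $x_{v,\theta}$ for the two three-taxon trees considered there and tests whether the exponent vector $a$ of the Section 7 invariant lies in $\mathcal{N}(T)$. It is an illustration under assumptions: characters $1, \phi, \psi, \phi\psi$ are encoded as 0, 1, 2, 3 (their product is XOR), leaves are indexed 0, 1, 2, each vertex is described by the set of leaves below it, and the function names are hypothetical.

```python
from itertools import product

def x_vectors(leaf_sets, m):
    """Return the allocations of characters to leaves and the dictionary
    {(v, theta): x_{v,theta}}; the character induced at v is the XOR
    (product) of the leaf characters over the leaves descended from v."""
    allocs = list(product(range(4), repeat=m))
    X = {}
    for v, leaves in enumerate(leaf_sets):
        for theta in (1, 2, 3):                 # phi, psi, phi*psi
            col = []
            for alloc in allocs:
                c = 0
                for leaf in leaves:
                    c ^= alloc[leaf]
                col.append(1 if c == theta else 0)
            X[(v, theta)] = col
    return allocs, X

def in_null_module(a, X):
    """True if sum_i a_i x^{(i)}_{v,theta} = 0 for every v and theta."""
    return all(sum(ai * xi for ai, xi in zip(a, x)) == 0 for x in X.values())

# Leaf sets of vertices 1..5 for the two trees of Section 7.
tree_I = [{0}, {1}, {2}, {0, 1}, {0, 1, 2}]
tree_II = [{0}, {1}, {2}, {0, 2}, {0, 1, 2}]
allocs, X_I = x_vectors(tree_I, 3)
_, X_II = x_vectors(tree_II, 3)

# Section 7 invariant: +1 on (phi, phi, psi), -1 on (phi, phi, 1) and (1, 1, psi).
index = {alloc: i for i, alloc in enumerate(allocs)}
a = [0] * len(allocs)
a[index[(1, 1, 2)]] = 1
a[index[(1, 1, 0)]] = -1
a[index[(0, 0, 2)]] = -1
```

With this `a`, `in_null_module(a, X_I)` is true and `in_null_module(a, X_II)` is false, matching the conclusion of Section 7 that the polynomial is an invariant for the first tree but not for the second.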



9. How Many Invariants Are There? 

Given our tree $T$ with $m$ leaves (taxa) and $n$ vertices in total, we have $4^m$
model probabilities $p((B_\ell)_{\ell \in L})$ that arise as polynomials in $3n$ “free parameters”:
3 free parameters for the root distribution and 3 free parameters for
each of the substitution matrices. A naive “degrees of freedom” argument would
suggest that there should, in some sense, be $4^m - 3n$ independent relations
between the model probabilities. We verify this numerology in this section by
showing that $\operatorname{rank} \mathcal{R}(T) = 3n$, and hence $\operatorname{rank} \mathcal{N}(T) = 4^m - 3n$. This and
related results were presented in [EZ98], but our proof here is quite different.

Let $X$ denote the $4^m \times 3n$ matrix with columns indexed by $V \times \{\phi, \psi, \phi\psi\}$
that has the column corresponding to $(v, \theta)$ given by $x_{v,\theta}$. We need to show
that the matrix $X$ has (real) rank $3n$, and this is equivalent to showing that the
associated $3n \times 3n$ Gram matrix $X^t X$ has full rank (see 0.4.6(d) of [HJ85]).

The entry of $X^t X$ with indices $((v^*, \theta^*), (v^{**}, \theta^{**}))$, $v^*, v^{**} \in V$, $\theta^*, \theta^{**} \in
\{\phi, \psi, \phi\psi\}$, is the usual scalar product of $x_{v^*,\theta^*}$ with $x_{v^{**},\theta^{**}}$, which is just the
number of assignments of characters to leaves that assign $\theta^*$ to $v^*$ and $\theta^{**}$ to
$v^{**}$. We can compute this number of assignments as follows.

If $v^* = v^{**}$ and $\theta^* = \theta^{**}$, then it is clear by symmetry that this entry is $4^{m-1}$,
whereas if $v^* = v^{**}$ and $\theta^* \ne \theta^{**}$, then this entry is obviously 0.

Consider now the case where $v^* \ne v^{**}$, so that the collection of leaves
descended from $v^*$ is not the same as the collection of leaves descended from $v^{**}$.
We claim that the entry of $X^t X$ with indices $((v^*, \theta^*), (v^{**}, \theta^{**}))$ is $4^{m-2}$. To
see this, write $L^*$ and $L^{**}$ for the leaves descended from $v^*$ and $v^{**}$, respectively.
Suppose first that $L^{**} \subset L^*$. If we have an assignment of characters to leaves
that assigns the characters $\eta^*$ to $v^*$ and $\eta^{**}$ to $v^{**}$, then replacing the character
assigned to some $\ell^* \in L^* \setminus L^{**}$ from $\chi^*$ (say) to $\rho^* \rho^{**} \eta^* \eta^{**} \chi^*$ and replacing the
character assigned to some $\ell^{**} \in L^{**}$ from $\chi^{**}$ (say) to $\rho^{**} \eta^{**} \chi^{**}$ gives a new
assignment of characters to leaves that assigns $\rho^*$ to $v^*$ and $\rho^{**}$ to $v^{**}$ (recall that
every character of $\mathbb{G}$ squares to 1). This correspondence is a bijection, so each of the
16 pairs of characters is assigned to $(v^*, v^{**})$ by the same number of the $4^m$ assignments. It follows
that the number of assignments of characters to leaves that assign $\theta^*$ to $v^*$ and $\theta^{**}$
to $v^{**}$ is indeed $4^{m-2}$ when $L^{**} \subset L^*$. A symmetric argument handles
the case $L^* \subset L^{**}$, and we leave this to the reader.

We conclude that $X^t X$ can be partitioned into $3 \times 3$ blocks so that the blocks
down the diagonal are all of the form

\[
\begin{pmatrix} 4^{m-1} & 0 & 0 \\ 0 & 4^{m-1} & 0 \\ 0 & 0 & 4^{m-1} \end{pmatrix},
\]

while the off-diagonal blocks are all of the form

\[
\begin{pmatrix} 4^{m-2} & 4^{m-2} & 4^{m-2} \\ 4^{m-2} & 4^{m-2} & 4^{m-2} \\ 4^{m-2} & 4^{m-2} & 4^{m-2} \end{pmatrix}.
\]

Now

\[
X^t X = 4^{m-2} (D + \mathbf{1}\mathbf{1}^t),
\]

where $\mathbf{1}$ is the (column) vector with all entries equal to 1 and $D$ is a matrix
partitioned into $3 \times 3$ blocks with the blocks down the diagonal all of the form

\[
\begin{pmatrix} 3 & -1 & -1 \\ -1 & 3 & -1 \\ -1 & -1 & 3 \end{pmatrix}
\]

and the off-diagonal blocks all zero. Note that $D$ is invertible with inverse a
partitioned matrix that has blocks down the diagonal all of the form

\[
\frac{1}{4} \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}
\]

and the off-diagonal blocks all zero. A standard result on inverses of small rank
perturbations (see 0.7.4 of [HJ85]) gives that $X^t X$ is indeed invertible (and hence
full rank), with inverse

\[
4^{-(m-2)} \Bigl( D^{-1} - \frac{D^{-1} \mathbf{1}\mathbf{1}^t D^{-1}}{1 + \mathbf{1}^t D^{-1} \mathbf{1}} \Bigr)
= 4^{-(m-2)} \Bigl( D^{-1} - \frac{1}{1 + 3n}\, \mathbf{1}\mathbf{1}^t \Bigr).
\]
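The rank count and the block structure of the Gram matrix can be confirmed on a small case. The sketch below uses the first three-taxon tree of Section 7 ($m = 3$, $n = 5$) and exact rational elimination from the standard library. The integer encoding of characters (0 to 3, product as XOR), the leaf-set description of the tree and the function names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

def x_vectors(leaf_sets, m):
    """{(v, theta): x_{v,theta}} with characters encoded as 0..3 (XOR product)."""
    allocs = list(product(range(4), repeat=m))
    cols = {}
    for v, leaves in enumerate(leaf_sets):
        for theta in (1, 2, 3):
            col = []
            for alloc in allocs:
                c = 0
                for leaf in leaves:
                    c ^= alloc[leaf]
                col.append(1 if c == theta else 0)
            cols[(v, theta)] = col
    return cols

def rank(rows):
    """Rank over the rationals by Gaussian elimination with exact arithmetic."""
    M = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(M[0])):
        piv = next((r for r in range(rk, len(M)) if M[r][c] != 0), None)
        if piv is None:
            continue
        M[rk], M[piv] = M[piv], M[rk]
        for r in range(rk + 1, len(M)):
            if M[r][c] != 0:
                factor = M[r][c] / M[rk][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[rk])]
        rk += 1
    return rk

m, n = 3, 5
leaf_sets = [{0}, {1}, {2}, {0, 1}, {0, 1, 2}]   # vertices 1..5 of the first tree
cols = x_vectors(leaf_sets, m)
gram = {(a, b): sum(x * y for x, y in zip(cols[a], cols[b]))
        for a in cols for b in cols}
r = rank(list(cols.values()))
```

Here `r` is $3n = 15$, so the null module has rank $4^3 - 15 = 49$, and the entries of `gram` are $4^{m-1} = 16$ on the diagonal, 0 for the same vertex with different characters, and $4^{m-2} = 4$ for different vertices.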

10. How Well Do Invariants Distinguish Between Trees? 

The last question remaining from Section 4 is, “Do different trees have differ- 
ent invariants?” The answer is “Yes.” This follows from Theorem 10 in [SSE93]. 
We give a different proof which actually establishes “how many” independent 
invariants distinguish between two different trees. 

We begin by making explicit the natural notion of equivalence for trees with
labelled leaves. We say that two trees $T'$ and $T''$ with the same set $L$ of leaves
are identical if there is a bijection $\tau$ from the set of vertices $V'$ of $T'$ to the set
of vertices $V''$ of $T''$ such that $\tau(\ell) = \ell$ for each leaf $\ell \in L$ and $u \in V'$ is the
father of $v \in V'$ in $T'$ if and only if $\tau(u) \in V''$ is the father of $\tau(v) \in V''$ in $T''$.
This is equivalent to requiring that $\tau(\ell) = \ell$ for each leaf $\ell \in L$ and $u \in V'$ is the
ancestor of $v \in V'$ in $T'$ if and only if $\tau(u) \in V''$ is the ancestor of $\tau(v) \in V''$ in
$T''$. It is not hard to see that two trees $T'$ and $T''$ with the same set $L$ of leaves
are identical if and only if for each $v' \in V'$ the set of leaves descended from $v'$ is
equal to the set of leaves descended from some $v'' \in V''$ and vice-versa.

Given two trees $T'$ and $T''$ with the same set $L$ of leaves, write $\kappa(T', T'')$
for the number of vertices $v''$ of $T''$ such that the collection of leaves descended
from $v''$ is not the collection of leaves descended from any vertex of $T'$. If $T'$
and $T''$ are not identical, then either $\kappa(T', T'') > 0$ or $\kappa(T'', T') > 0$. We claim
that the rank of the free $\mathbb{Z}$-module $\mathcal{N}(T') \cap \mathcal{R}(T'')$ is $3\kappa(T', T'')$. That is, there
are $3\kappa(T', T'')$ algebraically independent invariants for the tree $T'$ that are not
invariants for the tree $T''$, and similarly with the roles of $T'$ and $T''$ interchanged.

To establish this claim, first note that

\begin{align*}
\operatorname{rank}\bigl(\mathcal{N}(T') \cap \mathcal{R}(T'')\bigr)
&= \operatorname{rank}\bigl(\mathcal{R}(T'')\bigr) - \operatorname{rank}\bigl(\mathcal{R}(T') \cap \mathcal{R}(T'')\bigr) \\
&= \operatorname{rank}\bigl(\mathcal{R}(T') + \mathcal{R}(T'')\bigr) - \operatorname{rank}\bigl(\mathcal{R}(T')\bigr).
\end{align*}

Write $V'$ and $V''$ for the vertices of $T'$ and $T''$, respectively, and let $\tilde{V}''$ denote
the set of vertices $v''$ of $T''$ such that the collection of leaves descended from $v''$
is not the collection of leaves descended from any vertex of $T'$. Hence $|\tilde{V}''| =
\kappa(T', T'')$. Of course, if $v'' \in V'' \setminus \tilde{V}''$, then there is a vertex $v' \in V'$ such that
the assignment of characters to $v'$ and $v''$ for each assignment of characters to
leaves are the same, and hence the vector $x_{v',\theta}$ (calculated for $T'$) is the same as
the vector $x_{v'',\theta}$ (calculated for $T''$). The claim will thus follow if we can show
that the vectors

\[
\{x_{v',\theta} : v' \in V',\ \theta = \phi, \psi, \phi\psi\} \cup \{x_{v'',\theta} : v'' \in \tilde{V}'',\ \theta = \phi, \psi, \phi\psi\}
\]

are linearly independent over the integers (equivalently, over the reals).

Let $X$ denote the $4^m \times 3(|V'| + |\tilde{V}''|)$ matrix obtained by putting together all
these vectors, say indexing the columns by $(V' \cup \tilde{V}'') \times \{\phi, \psi, \phi\psi\}$ and making
the column corresponding to $(v, \theta)$ equal to $x_{v,\theta}$, for $v \in V'$ or $v \in \tilde{V}''$. We need
to show that $X$ has (real) rank $3(|V'| + |\tilde{V}''|)$, and this is equivalent to showing
that the associated $3(|V'| + |\tilde{V}''|) \times 3(|V'| + |\tilde{V}''|)$ Gram matrix $X^t X$ has full
rank. An argument very similar to that in Section 9 completes the proof.
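For the two three-taxon trees of Section 7 the claim can be checked directly: exactly one vertex of the second tree (the one above leaves 1 and 3) has a leaf set not occurring in the first tree, so the count written $\kappa(T', T'')$ here is 1 and the rank difference should be 3. The sketch below computes both sides; the encoding of characters as 0 to 3 with XOR as product, the leaf-set description of each tree and the function names (including `kappa`) are assumptions for the illustration.

```python
from fractions import Fraction
from itertools import product

def x_vectors(leaf_sets, m):
    """List of the vectors x_{v,theta}, characters encoded as 0..3 (XOR product)."""
    allocs = list(product(range(4), repeat=m))
    vecs = []
    for leaves in leaf_sets:
        for theta in (1, 2, 3):
            vec = []
            for alloc in allocs:
                c = 0
                for leaf in leaves:
                    c ^= alloc[leaf]
                vec.append(1 if c == theta else 0)
            vecs.append(vec)
    return vecs

def rank(rows):
    """Rank over the rationals by exact Gaussian elimination."""
    M = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for c in range(len(M[0])):
        piv = next((r for r in range(rk, len(M)) if M[r][c] != 0), None)
        if piv is None:
            continue
        M[rk], M[piv] = M[piv], M[rk]
        for r in range(rk + 1, len(M)):
            if M[r][c] != 0:
                factor = M[r][c] / M[rk][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[rk])]
        rk += 1
    return rk

def kappa(leaf_sets_a, leaf_sets_b):
    """Number of vertices of the second tree whose leaf set is not the leaf
    set of any vertex of the first tree."""
    seen = {frozenset(s) for s in leaf_sets_a}
    return sum(1 for s in leaf_sets_b if frozenset(s) not in seen)

tree_I = [{0}, {1}, {2}, {0, 1}, {0, 1, 2}]
tree_II = [{0}, {1}, {2}, {0, 2}, {0, 1, 2}]
vecs_I, vecs_II = x_vectors(tree_I, 3), x_vectors(tree_II, 3)
k = kappa(tree_I, tree_II)
rank_diff = rank(vecs_I + vecs_II) - rank(vecs_I)
```

Here `k` is 1 and `rank_diff` is 3, in line with the claimed rank $3\kappa(T', T'')$ of $\mathcal{N}(T') \cap \mathcal{R}(T'')$.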
Diffuse Tomography as a Source of Challenging
Nonlinear Inverse Problems for a General Class 

of Networks 

F. ALBERTO GRÜNBAUM



Abstract. Diffuse tomography refers to the use of probes in the infrared 
part of the energy spectrum to obtain images of highly scattering media. 
There are important potential medical applications and a host of diffi- 
cult mathematical issues in connection with this highly nonlinear inverse 
problem. Taking into account scattering gives a problem with many more 
unknowns, as well as pieces of data, than in the simpler linearized situa- 
tion. The aim of this paper is to show that in some very simplified discrete 
model, reckoning with scattering gives an inversion problem whose solution 
can be reduced to that of a finite number of linear inversion problems. We 
see here that at least for the model in question, the proportion of variables 
that can be solved for is higher in the nonlinear case than in the linear one. 
We also notice that this gives a highly nontrivial problem in what can be 
called network tomography. 



1. Introduction 

Optical, or diffuse, tomography, refers to the use of low energy probes to 
obtain images of highly scattering media. 

The main motivation for this line of work is, at present, the use of an infrared 
laser to obtain images of diagnostic value. There is a proposal to use this in 
neonatal clinics to measure oxygen content in the brains of premature babies 
as well as in the case of repeated mammography. With the discovery of highly 
specific markers that respond well in the optical or infrared region there are 
many potential applications of this emerging area; see [A1; A2].

There are a number of physically reasonable models that have been used in 
the formulation of the associated direct and inverse problems. These models are 
based on some approximation to a wave propagation model, such as the so-called 
diffusion approximation, or a transport equation model resulting in some type 
of linear Boltzmann equation. See [A1; A2; D; NW] for recent surveys of work
in this area. These papers give a detailed description of the physically relevant
formulations that different authors have considered. 

Our Markov chain formulation, going back to [G1; GP1; SGKZ], is different
from those contained in these papers. We model the evolution of a photon as it 
moves through tissue by means of a Markov chain. At any (discrete) instant of 
time a photon occupies one of the states of the chain. These states are meant 
to represent a discretization of phase space, i.e. they encode position as well 
as velocity of a photon at a given time. The chain has three kinds of states: 
incoming states (which are meant to represent source positions surrounding the 
object of interest), hidden states (which are meant to represent the positions 
and velocities of photons inside the tissue) and finally, outgoing states (which
represent detectors surrounding the object). We should also add an absorbing 
state at the center of each pixel to indicate that a photon “entering the pixel” 
can die in it. Instead of adding these extra states we simply do not assume that 
the sum of the one-step transition probabilities from a state should add to one. 
The difference between one and this sum is the probability of being absorbed 
into the pixel in question when coming into it from the corresponding state. 
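
As a small illustration of this convention, the sketch below reads the absorption probability of each state off the row-sum deficit of a sub-stochastic matrix. The three-state matrix is made up for the example and is not the model described above.

```python
import numpy as np

# Hypothetical sub-stochastic one-step matrix: rows need not sum to one.
P = np.array([
    [0.0, 0.6, 0.3],
    [0.2, 0.0, 0.5],
    [0.4, 0.4, 0.0],
])

def absorption_probabilities(P):
    """Return 1 - (row sum): the chance of being absorbed at the next step."""
    row_sums = P.sum(axis=1)
    if np.any(P < 0) or np.any(row_sums > 1 + 1e-12):
        raise ValueError("rows must be nonnegative with sum at most one")
    return 1.0 - row_sums

absorbed = absorption_probabilities(P)
print(absorbed)
```

No extra absorbing states are needed: the missing probability mass in each row plays that role.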

The direct problem would consist of determining different “input-output” 
quantities once the one-step transition probability matrix of our Markov chain 
has been given. 

The resulting inverse problem amounts to reconstructing the one-step transi- 
tion probability matrix for our Markov chain (with three kinds of states) from 
boundary measurements. This model is too simple and too general to faithfully 
reflect the physics of diffuse tomography but could be of interest in other set-ups. 
It gives a difficult class of nonlinear inverse problems for a certain general class 
of networks with a complex pattern of connections which are motivated by the 
diffuse tomography picture. 

Since our model is the result of a discretization both in the positions occupied 
by a photon as well as the direction in which it is moving, the states will be 
indicated below by arrows placed at the boundaries of each pixel and pointing in 
one of four possible directions. One of the smallest cases of interest in dimension 
two is this: 

This simple model features four pixels, eight source positions, eight detector
positions as well as eight hidden states. In this figure, incoming states are labeled 
by numbers enclosed in squares, outgoing states are labeled by numbers enclosed 
in circles, and hidden states are labeled by numbers enclosed in diamonds. The 
possible one step transitions are indicated in the next section, whereas the figure 
below displays (by means of arrows, as explained earlier) only the eight states of 
each kind. 

In [G4] a discussion can be found of the corresponding smallest case in di- 
mension three, where pixels are replaced by voxels and we have six different 
directions for our states. 

The physics, or what is left of it, is best compressed into a multiterminal 
network where the nodes are the states of our Markov chain and the oriented 
edges indicate one-step transitions (with unknown probabilities) between the 
corresponding nodes. This is what a probabilist would call a state diagram. 

As an example, here is the network corresponding to the physical model shown 
on the previous page (for clarity, when two nodes are joined by two opposite 
edges, we draw a single edge with arrows at both ends): 




Notice that there is an underlying linear dynamics governed by the (unknown) 
one-step transition probability matrix of our Markov chain, but the inversion 
problem of interest is still nonlinear. 

A remarkable feature of this simple model is that, at least for systems arising
from very coarse tomographic discretizations, it gives an exactly solvable system 
of nonlinear equations, i.e., a certain number of unknowns are expressible in 
terms of the data and a number of free parameters. The advantages of this 
rather uncommon situation are clear: for instance it is possible to go beyond 
iterative methods of solution, which are very common for nonlinear problems. 

In both the two-dimensional and three-dimensional situations we can consider 
as data the photon count for a source-detector pair, which is defined as the
probability that a photon that started at the source in question emerges at the detector
in question regardless of the number of steps involved. If we assume that every
one-step transition takes one unit of time we can consider the time-of-flight as a
random variable associated to each incoming-outgoing pair. The photon count 
is the moment of order zero of this collection of random variables. 

In Section 2 we see how far one can go using only the moment of order zero 
of time of flight. Section 3 considers the situation when we also use a small part 
of the information contained in the first moment of this collection of random 
variables. Section 4 deals with the issue of dealing with those variables that 
cannot be solved from the data. Finally Section 5 alludes to the fact that this 
same machinery can be applied in the non-physical situation when the dimension 
is neither two nor three but arbitrary. 

It is also instructive in each case to consider the standard tomographic linear 
problem when scattering is completely ignored and a photon can only be ab- 
sorbed in a pixel or continue in its straight-line trajectory. In this case each one 
of the four pixels, conveniently labeled (1, 1), (1, 2), (2, 1) and (2, 2) as the entries 
of a 2 x 2 matrix, is characterized by one parameter, its absorption probability. 

The results regarding the ratio between the number of variables we can solve 
for and the total number of unknowns for each one of these scenarios are given 
below. 

In the two-dimensional case, using four pixels (see figure on page 138) there are 
three situations: 

(1) The linear one where scattering is ignored, gives a problem with 4 unknowns 
and 4 pieces of data, of which only three are independent and allows one to 
solve for 3 out of 4 unknowns. 

(2) The general model discussed above (as in [GP1; GP2]) allows one to solve
for 48 out of a total of 64 unknowns, leaving the ratio of 3/4 unchanged.

(3) The use of time-of-flight information, which is discussed in Section 3, as well
as in [G3], [GM1], gives a slightly better ratio, namely 56/64 = 7/8 (the 8
remaining free parameters are those of Section 4).

When this comparison is done in dimension three, with a total of eight voxels, 
we get three situations: 

(1) The linear version of the problem (scattering being ruled out) gives a system
of 12 equations in 8 unknowns which can be solved for 7 of them in terms of
one arbitrary parameter, giving a ratio of 7/8.

(2) The general model (discussed in Sections 2 and 3) yields a system of 576
nonlinear equations in 288 variables that can be solved for 240 of them, with
a ratio of 240/288 = 5/6.

(3) The use of time-of-flight information (discussed in Section 3) raises the ratio
to 264/288 = 11/12, leaving the 24 free parameters of Section 4. This shows
that the consideration of a fully nonlinear problem can (in some sense) lead to
a better determined problem than the corresponding linearized one.
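
The counts quoted for the two linear problems can be checked with a short sketch. It assumes (this modeling step is ours, not spelled out in the text) that with scattering ignored the survival probability along a straight line is the product of the survival probabilities of the cells crossed, so that after taking logarithms each measurement is a line sum, and that each line is counted once.

```python
import itertools
import numpy as np

def line_sum_matrix(d):
    """Rows: axis-parallel lines through a 2 x ... x 2 grid; columns: cells."""
    cells = list(itertools.product(range(2), repeat=d))
    index = {cell: i for i, cell in enumerate(cells)}
    rows = []
    for axis in range(d):
        # A line is fixed by the coordinates other than `axis`.
        for rest in itertools.product(range(2), repeat=d - 1):
            row = np.zeros(len(cells))
            for pos in range(2):
                cell = rest[:axis] + (pos,) + rest[axis:]
                row[index[cell]] = 1.0
            rows.append(row)
    return np.array(rows)

for d in (2, 3):
    M = line_sum_matrix(d)
    print(d, M.shape, np.linalg.matrix_rank(M))
```

In both dimensions the kernel is spanned by the checkerboard pattern of alternating signs, which is why exactly one unknown stays free.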

We do not consider here the important issues of the difficulty in solving these 
systems or the sensitivity to errors of the corresponding problem. 

For a very nice and up-to-date discussion of work in this area one can see
[A1], [A2], [D], [NW]. For an early reference in the area of network tomography
see [V]. For similar problems in an area of great practical interest see the
recent article [CHNY].

Remark. This is an appropriate place to mention an oversight in [G4]. The
labeling of the states given in the introduction to that paper does not correspond
to the one used in [G4, Section 3]. The labeling used in the introduction to [G4]
represents an improvement over the one used in [G4, Section 3]. The results in 
[G4] are correct, but some of the inversion formulas are unduly complicated since 
they are written down using a more complicated labeling scheme. When we use 
the labeling given in the introduction to [G4] we can reduce the entire problem 
to a set of equivalent linear ones, obviating the last nonlinear step in [G4]. This 
is reported in [GM2]. 

2. General Framework and Some Results 

The one-step transition probability matrix $P$ is naturally broken up into blocks
that connect different types of states. We denote by $P_{IO}$ the block dealing with
a one-step transition from an arbitrary incoming state to an arbitrary outgoing
state. $P_{HH}$ denotes the corresponding block connecting hidden to hidden states,
$P_{IH}$ the one connecting incoming to hidden states and finally $P_{HO}$ accounts for
one-step transitions between hidden and outgoing states. For completeness we
give these matrices below.

    0     N11S  0     0     0     0     N11E  0
    S21N  0     0     S21E  0     0     0     0
    W21N  0     0     W21E  0     0     0     0
    0     0     E22W  0     0     E22N  0     0
    0     0     S22W  0     0     S22N  0     0
    0     0     0     0     N12S  0     0     N12W
    0     0     0     0     E12S  0     0     E12W
    0     W11S  0     0     0     0     W11E  0

The choice of names for the variables in $P$ is meant to indicate the
corresponding transitions, for instance N11S means that we enter pixel (1,1) going
north and exit it going south. It is convenient to refer to the figure on page 138 
at this point. 

Just as in [GP1], [GP2] we find it convenient to introduce matrices $A$, $X$, $Y$,
$W$ by means of

$$A = P_{HO}^{-1}, \qquad P_{IO} = XA^{-1}, \qquad P_{HH} = A^{-1}W, \qquad P_{IH} = XA^{-1}W - Y.$$

The transformation, for a given $P_{HO}$, from the matrices $P_{HH}$, $P_{IO}$, $P_{IH}$ to the
matrices $W$, $X$, $Y$ was introduced by S. Patch in [P3]. Notice that from $A$, $X$,
$W$ and $Y$ it is possible to recover (in that order) the matrices $P_{HO}$, $P_{IO}$, $P_{HH}$
and, finally, $P_{IH}$.

One advantage of introducing these matrices is that the input-output relation

$$Q_{IO} = P_{IO} + P_{IH}(I - P_{HH})^{-1}P_{HO}$$

can be rewritten, by multiplying both sides first by $A$ on the right and then by
$(I - A^{-1}W)$ on the right again, in the form

$$Q_{IO}(A - W) = X - Y.$$

In [GP1], [GP2] we exploited the block structure of the matrices $A$, $W$, $X$, $Y$
to show that once $Q_{IO}$ is given then $A$ is arbitrary. After choosing $A$, it is then
possible to derive explicit formulas for $X$, $Y$ and $W$.
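
The change of variables can be sanity-checked numerically. The sketch below uses made-up dense blocks of a hypothetical size (the real model has a sparse block structure that is ignored here), compares the closed form for the photon count with a truncated sum over path lengths, and then tests the rewritten relation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # hypothetical number of incoming, hidden and outgoing states
I = np.eye(n)

# Made-up blocks, scaled so that all row sums stay below one.
P_IO = 0.1 * rng.random((n, n))
P_IH = 0.1 * rng.random((n, n))
P_HH = 0.1 * rng.random((n, n))
P_HO = 0.05 * rng.random((n, n)) + 0.2 * I  # diagonally dominant, hence invertible

# Change of variables: A = P_HO^{-1}, P_IO = X A^{-1}, P_HH = A^{-1} W.
A = np.linalg.inv(P_HO)
X = P_IO @ A
W = A @ P_HH
Y = X @ np.linalg.inv(A) @ W - P_IH

# Photon count: closed form and a truncated sum over path lengths.
Q_IO = P_IO + P_IH @ np.linalg.inv(I - P_HH) @ P_HO
Q_series = P_IO + sum(
    P_IH @ np.linalg.matrix_power(P_HH, k) @ P_HO for k in range(60)
)

print(np.allclose(Q_IO, Q_series))
print(np.allclose(Q_IO @ (A - W), X - Y))
```

The second check only uses algebra, so it holds for any invertible $P_{HO}$ as long as $I - P_{HH}$ is invertible.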

In the three-dimensional case the situation is a bit better, although the equa- 
tions that we have to handle are naturally harder to deal with. We find that the 
matrix A can no longer be picked arbitrarily but only 2/3 of it is arbitrary. This 
means that using photon count alone it is possible to express 24 of the 72 entries 
in the matrix A in terms of the data and 48 free parameters in A. By the photon 
count matrix we refer to the matrix whose entries are given by the probabilities 
that a photon that starts at a given source position would emerge from the tissue 
at a specified detector position. For details consult [G4] and [GM2]. 

3. Using the First Moment of Time-of-Flight 

Now we go beyond the photon count and consider the first moment of the
time-of-flight. As observed in the introduction the moment of order zero of
this collection of random variables (one for each source-detector pair) gives the
photon count matrix $Q_{IO}$.
If we denote the expression

$$P_{IH}(I - P_{HH})^{-2}P_{HO}$$

by $R$, we have:

Lemma. The first moment of the "time-of-flight" can be expressed as

$$Q_{IO} + R.$$

PROOF. Start from the observation that the $j$-th moment of the time of flight
is given by

$$Q_{IO}^{(j)} = P_{IO} + \sum_{k=0}^{\infty} P_{IH}P_{HH}^{k}P_{HO}\,(k+2)^{j}. \tag{3-1}$$

In particular, if $j = 0$ we recover (after an appropriate summation of the
corresponding geometric series) the expression for $Q_{IO} = Q_{IO}^{(0)}$ given in Section 2.
We will return to this expression later in this section.

For $j = 1$ we get

$$\begin{aligned}
Q_{IO}^{(1)} &= P_{IO} + 2P_{IH}(I - P_{HH})^{-1}P_{HO} + P_{IH}P_{HH}(I - P_{HH})^{-2}P_{HO} \\
&= Q_{IO} + P_{IH}(I - P_{HH})^{-2}[I - P_{HH} + P_{HH}]P_{HO} \\
&= Q_{IO} + R.
\end{aligned}$$

□
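
The lemma can be checked numerically by summing the series (3-1) directly. The blocks below are made-up random matrices of a hypothetical size, scaled so that the series converges quickly; they are not taken from the tomographic model.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3  # hypothetical number of states of each kind
I = np.eye(n)
P_IO, P_IH, P_HH, P_HO = (0.1 * rng.random((n, n)) for _ in range(4))

G = np.linalg.inv(I - P_HH)
Q_IO = P_IO + P_IH @ G @ P_HO   # moment of order zero
R = P_IH @ G @ G @ P_HO         # P_IH (I - P_HH)^{-2} P_HO

# First moment from (3-1): a path with k hidden-to-hidden steps lasts k + 2 units.
first_moment = P_IO + sum(
    (k + 2) * (P_IH @ np.linalg.matrix_power(P_HH, k) @ P_HO) for k in range(80)
)

print(np.allclose(first_moment, Q_IO + R))
```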

Since $Q_{IO}$ is taken as data we can consider $R$ as the extra information provided
by the expected value of time of flight.

Observe now that we have the relation

$$Q_{IO}A - X(A) = R(A - W(A)).$$

This follows, for instance, by noticing that each side of this identity is given by
$P_{IH}(I - P_{HH})^{-1}$.

In the two-dimensional case ([GP2; GM1]) this concludes the job since we can 
use some of the entries of the matrix R to determine the ratios among eight pairs 
of the entries in $A$. Explicit formulas are given in [GM1].

The three-dimensional case has been given a first treatment in [G4]. By using
the labeling mentioned in the introduction to that paper it is possible to obtain 
explicit formulas similar to those mentioned above. For details see [GM2]. 

It is very important to notice that in either dimension the entire problem 
of determining the blocks in P admits a natural “gauge transformation” given 
exactly by a diagonal matrix D. Consider the transformation that goes from a 
given set of blocks, to a new one given by the relations 

$$\tilde P_{IO} = P_{IO}, \qquad \tilde P_{IH} = P_{IH}D^{-1}, \qquad \tilde P_{HH} = DP_{HH}D^{-1}, \qquad \tilde P_{HO} = DP_{HO}.$$

Notice that this gauge transformation preserves the required block structure of
all the matrices in question. Moreover the probability of going from an arbitrary
incoming state to an arbitrary outgoing state in $m$ steps, given by the matrix
$P_{IO}$ if $m = 1$ and by $P_{IH}P_{HH}^{m-2}P_{HO}$ if $m \ge 2$, is clearly invariant under the
transformation mentioned above. It follows then by referring to (3-1) for the
$j$-th moment of the time of flight distribution that this is not affected by this
gauge.

In conclusion, we have shown that the zeroth and first moments of the
time-of-flight distribution determine the matrix $P$ up to the choice of the arbitrary
diagonal matrix $D$ introduced above.
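
The invariance can be observed numerically. In the sketch below the blocks and the diagonal matrix $D$ are random stand-ins of a hypothetical size, and the moments are computed from a truncated version of the series for the $j$-th moment.

```python
import numpy as np

def moment(P_IO, P_IH, P_HH, P_HO, j, terms=80):
    """Truncated series for the j-th moment of the time of flight."""
    total = P_IO.copy()
    power = np.eye(P_HH.shape[0])
    for k in range(terms):
        total += (k + 2) ** j * (P_IH @ power @ P_HO)
        power = power @ P_HH
    return total

rng = np.random.default_rng(2)
n = 3  # hypothetical number of states of each kind
P_IO, P_IH, P_HH, P_HO = (0.1 * rng.random((n, n)) for _ in range(4))

# Gauge: an arbitrary invertible diagonal matrix acting on the hidden states.
D = np.diag(rng.uniform(0.5, 2.0, size=n))
D_inv = np.linalg.inv(D)

original = (P_IO, P_IH, P_HH, P_HO)
gauged = (P_IO, P_IH @ D_inv, D @ P_HH @ D_inv, D @ P_HO)

invariant = all(
    np.allclose(moment(*original, j), moment(*gauged, j)) for j in range(3)
)
print(invariant)
```

The factors $D^{-1}D$ cancel inside every product $P_{IH}P_{HH}^{k}P_{HO}$, so moments of every order agree, not only the first two.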

4. Taking into Account a Physical Model 

An important question remains: how should the values of the 24 free param- 
eters be picked (or the 8 free parameters in dimension two)? A similar question 
was discussed in [GP2] where we considered the effect of imposing on our very 
general model the assumption of “microscopic reversibility”, i.e., a one-step tran- 
sition from a state (of our Markov chain) given by the vector v to a state given 
by the vector $w$ has the same probability as a transition from the states given
by the vectors $-w$ and $-v$ respectively. On the other hand, in [G2], [GZ] we
considered the case of isotropic scattering. Each one of these cases leads to a
dramatic reduction in the number of free parameters. 

It is tempting to make some of these simplifying assumptions at the very 
beginning of the process, thereby reducing the number of unknowns. Experience 
seems to indicate that the possibility of reducing the already nonlinear system of 
equations to a linear one is greatly enhanced by making use of these assumptions 
at the end of the process. 

5. A Network Tomography Problem for the Hypercube 

The two-dimensional and three-dimensional problems discussed above have a 
firm foundation in diffuse tomography. It is however possible to go to higher 
dimensions and consider the corresponding $d$-dimensional hypercube and the
network that goes along with it. By using the techniques in [GM1] and [GM2]
it is possible to see that by measuring the first two moments (zeroth and first)
of time-of-flight we can determine everything explicitly up to a total of $d\,2^d$ free
parameters. This happens to be the dimension of the gauge that appears at the 
end of Section 3, and thus this result is optimal. Details will appear in [GM3]. 

Acknowledgments. We thank the editors for useful suggestions on ways to 
improve the presentation. 

An Invitation to Matrix-Valued Spherical
Functions: Linearization of Products in the Case
of Complex Projective Space $P_2(\mathbb{C})$

F. ALBERTO GRUNBAUM, INES PACHARONI, AND JUAN TIRAO 



Abstract. The classical (scalar-valued) theory of spherical functions, put 
forward by Cartan and others, unifies under one roof a number of exam- 
ples that were very well-known before the theory was formulated. These 
examples include special functions such as Jacobi polynomials, Bessel
functions, Laguerre polynomials, Hermite polynomials, Legendre functions, 
which had been workhorses in many areas of mathematical physics before 
the appearance of a unifying theory. These and other functions have found 
interesting applications in signal processing, including specific areas such 
as medical imaging. 

The theory of matrix-valued spherical functions is a natural extension of
the well-known scalar-valued theory. Its historical development, however, 
is different: in this case the theory has gone ahead of the examples. The 
purpose of this article is to point to some examples and to interest readers 
in this new aspect in the world of special functions. 

We close with a remark connecting the functions described here with 
the theory of matrix-valued orthogonal polynomials. 



1. Introduction and Statement of Results 

The theory of matrix-valued spherical functions (see [GV; T]) gives a natural 
extension of the well-known theory for the scalar-valued case, see [He]. We start
with a few remarks about the scalar-valued case. 

The classical (scalar-valued) theory of spherical functions (put forward by 
Cartan and others after him) allows one to unify under one roof a number of 
examples that were very well known before the theory was formulated. These ex- 
amples include many special functions like Jacobi polynomials, Bessel functions, 
Laguerre polynomials, Hermite polynomials, Legendre functions, etc. 



This paper is partially supported by NSF grants FD9971151 and 1-443964-21160 and by CON- 

All these functions had “proved themselves” as the work-horse in many areas
of mathematical physics before the appearance of a unifying theory. Many of 
these functions have found interesting applications in signal processing in gen- 
eral as well as in very specific areas like medical imaging. It suffices to recall, 
for instance, that Cormack’s approach [C] — for which he got the 1979 Nobel 
Prize in Medicine, along with G. Hounsfield — was based on classical orthogonal 
polynomials and that the work of Hammaker and Solmon [HS] as well as that of 
Logan and Shepp [LS] is based on the use of Chebyshev polynomials.

The crucial property here is the fact that these functions satisfy the integral 
equation that characterizes spherical functions of a homogeneous space. For 
a review on some of these topics the reader can either look at some of the 
specialized books on the subject such as [He] or start from a more introductory 
approach as that given in either [DMcK] or [T1, vol. I].

This integral equation is actually satisfied by all Gegenbauer polynomials and 
not only those corresponding to symmetric spaces. This point is fully exploited 
in [DG] where this property is put to use to show that different weight functions 
can be used in carrying out the usual tomographic operations of projection and 
backprojection. This works well for parallel beam tomography but has never 
been made to work for fan beam tomography because of a lack of an underlying 
group theoretical formulation in this case. For a number of issues in this area, 
including a number of open problems, see [G2]. 

For a variety of other applications of spherical functions one can look at 
[DMcK; T1].

We now come to the main issue in this article. 

The situation with the matrix-valued extension of this theory is entirely dif- 
ferent. In this case the theory has gone ahead of the examples and, in fact, to 
the best of our knowledge, the first examples involving nonscalar matrices have 
been given recently in [GPT1; GPT2; GPT3]. For scalar-valued instances of
nontrivial type, see [HeSc].

The issue of how useful these functions may turn out to be as a tool in areas 
like geometry, mathematical physics, or signal processing in the broad sense is 
still open. From a historical perspective one could argue, rather tautologically, 
that the usefulness of the classical spherical functions rests on the many inter- 
esting properties they all share. With that goal in mind, it is natural to try to 
give a glimpse at these new objects and to illustrate some of their properties. 
The rather mixed character of the audience attending these lectures gives us an 
extra incentive to make this material accessible to people that might normally 
not look in the specialized literature. 

The purpose of this contribution is thus to present very briefly the essentials 
of the theory and to describe one example in some detail. This is not the appro- 
priate place for a complete description, and we refer the interested reader to the 
papers [GPT1; GPT2; GPT3].

We hope to pique the curiosity of some readers by exploring the extent to
which the property of “positive linearization of products” holds in the case of
the spherical functions associated to $P_2(\mathbb{C})$. This result has been important
in the scalar case, including its use in the proof of the Bieberbach conjecture, 
see [AAR]. The property in question is illustrated well by considering the case 
of Legendre polynomials: the product of any two such is expressed as a linear 
combination involving other Legendre polynomials with degrees ranging from 
the absolute value of the difference to the sum of the degrees of the two factors 
involved. Moreover, the coefficients in this expansion are positive. 
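
For readers who want to see this property at work, the sketch below expands the product of two Legendre polynomials back into the Legendre basis with NumPy's `legmul`. The helper name `legendre_product` is ours.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_product(i, j):
    """Legendre-series coefficients of the product P_i * P_j."""
    ci = np.zeros(i + 1)
    ci[i] = 1.0
    cj = np.zeros(j + 1)
    cj[j] = 1.0
    return legendre.legmul(ci, cj)

coeffs = legendre_product(2, 3)
for degree, value in enumerate(coeffs):
    print(degree, round(float(value), 6))
```

For the pair (2, 3) only the degrees 1, 3 and 5 appear, with coefficients 9/35, 4/15 and 10/21. They are positive and add up to one because every Legendre polynomial equals one at $x = 1$.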

We should stress that the intriguing property described here is one enjoyed by 
a matrix-valued function put together from different spherical functions of a given 
type. In the classical scalar-valued case these two notions agree and the warning 
is not needed. This combination of spherical functions has already been seen
(see [GPT1; GPT2; GPT3]) to enjoy a natural form of the bispectral property.
For an introduction to this expanding subject one can consult, for instance,
[DG1; G12]. The roots of this problem are too long to trace in this short paper,
but the reader may want to take a look at [SI]. For off-shoots that have yet to 
be explored further one can also see [G13; G15]. The short version of the story 
is that some remarkably useful algebraic properties that have surfaced first in 
signal processing and which one would like to extend and better understand have 
a long series of connections with other parts of mathematics. For a collection of 
problems arising in this area see [HK]. 

The issue of linearization of products, without insisting on any positivity 
results, plays (in the scalar-valued case) an important role in fairly successful 
applications of mathematics. For example, the issue of expressing the product 
of spherical harmonics of different degrees as a sum of spherical harmonics plays 
a substantial role in both theoretical and practical algorithms for the harmonic 
analysis of functions on the sphere. For some developments in this area see [DH] 
as well as [KMHR]. 

In the context of quantum mechanics this discussion is the backbone of the 
addition rule for angular momenta as can be seen in any textbook on the subject. 

In the last section we make a brief remark connecting the functions described 
here with the theory of matrix-valued orthogonal polynomials, as developed for 
instance in [D] and [DVA]. 



2. Matrix-Valued Spherical Functions

Let $G$ be a locally compact unimodular group and let $K$ be a compact
subgroup of $G$. Let $\hat K$ denote the set of all equivalence classes of complex finite
dimensional irreducible representations of $K$; for each $\delta \in \hat K$, let $\xi_\delta$ denote the
character of $\delta$, $d(\delta)$ the degree of $\delta$, i.e. the dimension of any representation in
the class $\delta$, and $\chi_\delta = d(\delta)\xi_\delta$.

Given a homogeneous space $G/K$ a zonal spherical function ([He]) $\varphi$ on $G$ is
a continuous complex valued function which satisfies $\varphi(e) = 1$ and

$$\varphi(x)\varphi(y) = \int_K \varphi(xky)\,dk, \qquad x, y \in G. \tag{2-1}$$
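
A classical instance of (2-1), not worked out in the text, is $G = SO(3)$ with $K$ the rotations about the $z$-axis, where $g \mapsto P_n(g_{33})$ is a zonal spherical function for the Legendre polynomial $P_n$. The sketch below checks the integral equation numerically; the choice of group, the test angles and the helper names are ours.

```python
import numpy as np
from numpy.polynomial import legendre

def rot_y(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def rot_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def zonal(g, degree):
    """phi(g) = P_degree(g[2, 2]); constant on double cosets K g K."""
    coeffs = np.zeros(degree + 1)
    coeffs[degree] = 1.0
    return legendre.legval(g[2, 2], coeffs)

def average_over_K(x, y, degree, samples=64):
    """Mean of phi(x k y) over equally spaced rotations k about the z-axis."""
    angles = 2.0 * np.pi * np.arange(samples) / samples
    return np.mean([zonal(x @ rot_z(a) @ y, degree) for a in angles])

x = rot_y(0.7) @ rot_z(0.3)
y = rot_z(1.1) @ rot_y(1.9)
lhs = zonal(x, 3) * zonal(y, 3)
rhs = average_over_K(x, y, 3)
print(lhs, rhs)
```

The integrand is a trigonometric polynomial of low degree in the rotation angle, so an equally spaced average with 64 samples reproduces the integral over $K$ up to rounding.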

The following definition gives a fruitful generalization of this concept. 

Definition 2.1 [T; GV]. A spherical function $\Phi$ on $G$ of type $\delta \in \hat K$ is a
continuous function on $G$ with values in $\mathrm{End}(V)$ such that

(i) $\Phi(e)$ equals $I$, the identity transformation.

(ii) $\Phi(x)\Phi(y) = \int_K \chi_\delta(k^{-1})\Phi(xky)\,dk$, for all $x, y \in G$.

The connection with differential equations of the group $G$ comes from the
property below.

Let $D(G)^K$ denote the algebra of all left invariant differential operators on $G$
which are also invariant under all right translations by elements in $K$. If $(V, \pi)$
is a finite dimensional irreducible representation of $K$ in the equivalence class
$\delta \in \hat K$, a spherical function on $G$ of type $\delta$ is characterized by:

(i) $\Phi : G \to \mathrm{End}(V)$ is analytic.

(ii) $\Phi(k_1 g k_2) = \pi(k_1)\Phi(g)\pi(k_2)$, for all $k_1, k_2 \in K$, $g \in G$, and $\Phi(e) = I$.

(iii) $[D\Phi](g) = \Phi(g)[D\Phi](e)$, for all $D \in D(G)^K$, $g \in G$.

We will be interested in the specific example given by the complex projective
plane. This can be realized as the homogeneous space $G/K$, where $G = SU(3)$
and $K = S(U(2) \times U(1))$. In this case (iii) above can be replaced by:
$[\Delta_2\Phi](g) = \lambda_2\Phi(g)$, $[\Delta_3\Phi](g) = \lambda_3\Phi(g)$ for all $g \in G$ and for some
$\lambda_2, \lambda_3 \in \mathbb{C}$. Here $\Delta_2$ and $\Delta_3$ are two algebraically independent generators of
the polynomial algebra $D(G)^G$ of all differential operators on $G$ which are
invariant under left and right multiplication by elements in $G$.

The set $\hat K$ can be identified with the set $\mathbb{Z} \times \mathbb{Z}_{\ge 0}$. If
$k = \begin{pmatrix} A & 0 \\ 0 & a \end{pmatrix}$, with $A \in U(2)$ and $a = (\det A)^{-1}$, then

$$\pi(k) = \pi_{n,l}(A) = (\det A)^n A^l,$$

where $A^l$ denotes the $l$-symmetric power of $A$, defines an irreducible
representation of $K$ in the class $(n, l) \in \mathbb{Z} \times \mathbb{Z}_{\ge 0}$.
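
To make the representation $\pi_{n,l}$ concrete, the sketch below builds the matrix of the $l$-th symmetric power of a $2 \times 2$ matrix in the monomial basis $e_1^{l-k}e_2^{k}$ and checks that $A \mapsto (\det A)^n A^l$ is multiplicative. The basis, the function names and the random test matrices are our choices; in this (non-orthonormal) basis the matrices are not unitary, which does not affect the check.

```python
import numpy as np

def symmetric_power(A, l):
    """Matrix of Sym^l(A) in the basis e1^(l-k) e2^k, k = 0..l.

    Column k holds the expansion of (A e1)^(l-k) (A e2)^k; a polynomial is
    stored by its coefficients indexed by the power of e2.
    """
    A = np.asarray(A, dtype=complex)
    u, v = A[:, 0], A[:, 1]  # A e1 = u[0] e1 + u[1] e2, likewise for A e2
    S = np.zeros((l + 1, l + 1), dtype=complex)
    for k in range(l + 1):
        poly = np.array([1.0 + 0.0j])
        for _ in range(l - k):
            poly = np.convolve(poly, u)
        for _ in range(k):
            poly = np.convolve(poly, v)
        S[:, k] = poly
    return S

def pi_nl(A, n, l):
    """pi_{n,l}(A) = (det A)^n Sym^l(A)."""
    return np.linalg.det(A) ** n * symmetric_power(A, l)

rng = np.random.default_rng(3)

def random_unitary():
    q, _ = np.linalg.qr(rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))
    return q

A, B = random_unitary(), random_unitary()
print(np.allclose(pi_nl(A @ B, 2, 3), pi_nl(A, 2, 3) @ pi_nl(B, 2, 3)))
```

The same function accepts any $2 \times 2$ complex matrix, which matches the statement that $\pi_{n,l}$ extends to a multiplicative map on $M(2, \mathbb{C})$ when $n \ge 0$.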

For simplicity we restrict ourselves in this brief presentation to the case $n \ge 0$.
The paper [GPT1] deals with the general case. The representation $\pi_{n,l}$ of $U(2)$
extends to a unique holomorphic multiplicative map of $M(2, \mathbb{C})$ into $\mathrm{End}(V_\pi)$,
which we shall still denote by $\pi_{n,l}$. For any $g \in M(3, \mathbb{C})$, we shall denote by $A(g)$
the left upper $2 \times 2$ block of $g$, i.e.

$$A(g) = \begin{pmatrix} g_{11} & g_{12} \\ g_{21} & g_{22} \end{pmatrix}.$$

For any $\pi = \pi_{n,l}$ with $n \ge 0$ let $\Phi_\pi : G \to \mathrm{End}(V_\pi)$ be defined by

$$\Phi_\pi(g) = \Phi_{n,l}(g) = \pi_{n,l}(A(g)).$$

It happens that $\Phi_\pi$ is a spherical function of type $(n, l)$, one that will play a very
important role in the construction of all the remaining spherical functions of the
same type.

Consider the open set

$$\mathcal{A} = \{ g \in G : \det A(g) \ne 0 \}.$$

The group $G = SU(3)$ acts in a natural way on the complex projective plane
$P_2(\mathbb{C})$. This action is transitive and $K$ is the isotropy subgroup of the point
$(0, 0, 1) \in P_2(\mathbb{C})$. Therefore $P_2(\mathbb{C}) = G/K$. We shall identify the complex plane
$\mathbb{C}^2$ with the affine plane $\{ (x, y, 1) \in P_2(\mathbb{C}) : (x, y) \in \mathbb{C}^2 \}$.

The canonical projection $p : G \to P_2(\mathbb{C})$ maps the open dense subset $\mathcal{A}$ onto
the affine plane $\mathbb{C}^2$. Observe that $\mathcal{A}$ is stable by left and right multiplication by
elements in $K$.

To determine all spherical functions $\Phi : G \to \mathrm{End}(V_\pi)$ of type $\pi = \pi_{n,l}$ we
use the function $\Phi_\pi$ introduced above in the following way: in the open set $\mathcal{A}$
we define a function $H$ by

$$H(g) = \Phi(g)\,\Phi_\pi(g)^{-1},$$

where $\Phi$ is supposed to be a spherical function of type $\pi$. (On $\mathcal{A}$ the matrix
$A(g)$ is invertible, hence so is $\Phi_\pi(g)$.) Then $H$ satisfies:

(i) $H(e) = I$.

(ii) $H(gk) = H(g)$, for all $g \in \mathcal{A}$, $k \in K$.

(iii) $H(kg) = \pi(k)H(g)\pi(k^{-1})$, for all $g \in \mathcal{A}$, $k \in K$.

Property (ii) says that $H$ may be considered as a function on $\mathbb{C}^2$.

The fact that $\Phi$ is an eigenfunction of $\Delta_2$ and $\Delta_3$ makes $H$ into an
eigenfunction of certain differential operators $D$ and $E$ on $\mathbb{C}^2$.

We are interested in considering the differential operators $D$ and $E$ applied
to a function $H \in C^\infty(\mathbb{C}^2) \otimes \mathrm{End}(V_\pi)$ such that $H(kp) = \pi(k)H(p)\pi(k)^{-1}$, for
all $k \in K$ and $p$ in the affine complex plane $\mathbb{C}^2$. This property of $H$ allows us to
find ordinary differential operators $\tilde D$ and $\tilde E$ defined on the interval $(0, \infty)$ such
that

$$(DH)(r, 0) = (\tilde D\tilde H)(r), \qquad (EH)(r, 0) = (\tilde E\tilde H)(r),$$

where $\tilde H(r) = H(r, 0)$.

Introduce the variable $t = (1 + r^2)^{-1}$, which converts the operators $\tilde D$ and $\tilde E$
into new operators $D$ and $E$.

The functions $\tilde H$ turn out to be diagonalizable. Thus, in an appropriate basis
of $V_\pi$, we can write $\tilde H(r) = H(t) = (h_0(t), \ldots, h_l(t))$.

We find it very convenient to introduce two integer parameters $w$, $k$ subject to
the following three inequalities: $0 \le w$, $0 \le k \le l$, which give a very convenient
parametrization of the irreducible spherical functions of type $(n, l)$. In fact, for
each pair $(l, n)$, there are a total of $l + 1$ families of matrix-valued functions
of $t$ and $w$. In this instance these matrices are diagonal and one can put these
diagonals together into a full matrix-valued function as we will do in the next two 
sections. It appears that this function, which coincides with the usual spherical 
function in the scalar case, enjoys some interesting properties. 

The reader can consult [GPT1] to find a fairly detailed description of the 
entries that make up the matrices mentioned up to now. A flavor of the results 
is given by the following statement. 

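For a given $l \ge 0$, the spherical functions corresponding to the pair $(l, n)$ have
components that are expressed in terms of generalized hypergeometric functions
of the form ${}_{p+2}F_{p+1}$, namely

$${}_{p+2}F_{p+1}\!\left(\begin{matrix} a,\ b,\ s_1 + 1,\ \ldots,\ s_p + 1 \\ c,\ s_1,\ \ldots,\ s_p \end{matrix};\ t\right) = \sum_{j=0}^{\infty} \frac{(a)_j (b)_j}{j!\,(c)_j}\,(1 + d_1 j + \cdots + d_p j^p)\,t^j.$$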
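
The polynomial factor on the right comes from the ratios $(s_i + 1)_j/(s_i)_j = 1 + j/s_i$, whose product over $i$ is a polynomial of degree $p$ in $j$ with constant term one. The sketch below checks this coefficient by coefficient in exact arithmetic; the parameter values are arbitrary test data, not values arising from the spherical functions themselves.

```python
from fractions import Fraction
from math import factorial

def poch(x, j):
    """Pochhammer symbol (x)_j = x (x + 1) ... (x + j - 1)."""
    out = Fraction(1)
    for i in range(j):
        out *= x + i
    return out

def hypergeometric_coeff(a, b, c, s, j):
    """j-th Taylor coefficient of the p+2 F p+1 series with parameters s_i."""
    value = poch(a, j) * poch(b, j) / (poch(c, j) * factorial(j))
    for si in s:
        value *= poch(si + 1, j) / poch(si, j)
    return value

def polynomial_form_coeff(a, b, c, s, j):
    """Same coefficient written as a 2F1 term times a polynomial in j."""
    value = poch(a, j) * poch(b, j) / (poch(c, j) * factorial(j))
    for si in s:
        value *= 1 + Fraction(j) / si
    return value

a, b, c = -4, 7, 3                  # a = -4 makes the series terminate
s = [Fraction(5, 2), Fraction(4)]   # hypothetical s_1, s_2 (so p = 2)
pairs = [
    (hypergeometric_coeff(a, b, c, s, j), polynomial_form_coeff(a, b, c, s, j))
    for j in range(6)
]
print(all(left == right for left, right in pairs))
```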



3. The Bispectral Property 

For given nonnegative integers n, l and w consider the matrix whose rows are
given by the vectors H(t) corresponding to the values k = 0, 1, 2, ..., l discussed
above. Denote the corresponding matrix by

$$\Phi(w,t).$$

As a function of t, $\Phi(w,t)$ satisfies two differential equations

$$D\,\Phi(w,t)^* = \Phi(w,t)^*\,\Lambda, \qquad E\,\Phi(w,t)^* = \Phi(w,t)^*\,M .$$

Here $\Lambda$ and $M$ are diagonal matrices with

$$\Lambda(i,i) = -w(w+n+i+l+1) - (i-1)(n+i),$$

$$M(i,i) = \Lambda(i,i)\,(n-l+3i-3) - 3(i-1)(l-i+2)(n+i),$$

for $1 \le i \le l+1$; D and E are the differential operators introduced earlier.
Moreover we have

Theorem 3.1. There exist matrices $A_w$, $B_w$, $C_w$, independent of t, such that

$$A_w\,\Phi(w-1,t) + B_w\,\Phi(w,t) + C_w\,\Phi(w+1,t) = t\,\Phi(w,t).$$

The matrices $A_w$ and $C_w$ consist of two diagonals each and $B_w$ is tridiagonal.
Assume, for convenience, that these vectors are normalized in such a way that
for t = 1 the matrix $\Phi(w,1)$ consists of all ones.

For details on these matrices as well as for a full proof of this statement, which 
was conjectured in [GPT1], the reader can consult [GPT2] and [PT]. 
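
The diagonal entries of the two eigenvalue matrices in this section are easy to tabulate. The following is a minimal sketch, assuming the entries read $\Lambda(i,i) = -w(w+n+i+l+1) - (i-1)(n+i)$ and $M(i,i) = \Lambda(i,i)(n-l+3i-3) - 3(i-1)(l-i+2)(n+i)$ for $1 \le i \le l+1$; the function name `eigenvalue_diagonals` is hypothetical and not taken from [GPT1].

```python
def eigenvalue_diagonals(w, n, l):
    """Return the diagonals of Lambda and M as two lists of length l + 1.

    Assumes the diagonal formulas quoted in the lead-in; i runs over 1..l+1.
    """
    lam_diag, m_diag = [], []
    for i in range(1, l + 2):
        lam = -w * (w + n + i + l + 1) - (i - 1) * (n + i)
        m = lam * (n - l + 3 * i - 3) - 3 * (i - 1) * (l - i + 2) * (n + i)
        lam_diag.append(lam)
        m_diag.append(m)
    return lam_diag, m_diag


# Example: l = 1, so both matrices are 2 x 2.
lam_diag, m_diag = eigenvalue_diagonals(w=2, n=3, l=1)
print(lam_diag, m_diag)
```

For l = 1 the two entries of the first list reduce to $-w(w+n+3)$ and $-w(w+n+4) - n - 2$.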

4. Linearization of Products



The property in question states that the product of members of certain families
of (scalar-valued) orthogonal polynomials is given by an expansion of the form

$$P_i\,P_j = \sum_{k=|j-i|}^{j+i} a_k\,P_k$$

and that the coefficients in the expansion are all nonnegative.

For a nice and detailed account of the situation in the scalar case, see for
instance [A], [S]. Very important contributions on these and related matters are
[G] and [K].

It is important to note that the property in question is not true for all families
of orthogonal polynomials; in fact it is not even true for all Jacobi polynomials
$P_w^{(\alpha,\beta)}$ normalized by the condition that $P_w^{(\alpha,\beta)}(1)$ be positive. For our purpose it is
important to recall that nonnegativity is satisfied if $\alpha \ge \beta$ and $\alpha + \beta \ge 1$.

The case l = 0, n > 1.

From [GPT1] we know that when $l = 0$ and $n \ge 0$ the appropriate eigenfunctions
(without the standard normalization) are given by

$$\Phi(w,t) = {}_2F_1\!\left(\begin{matrix} -w,\ w+n+2 \\ n+1 \end{matrix};\ t\right).$$

This means that with the usual convention that the Jacobi polynomials are
positive for t = 1 we are dealing with the family

$$P_w^{(1,n)}(t).$$

If n = 0 or n = 1 the family $P_w^{(1,n)}$ meets the sufficient conditions for nonnegativity
given above. For n = 0 the coefficients $a_k$ are all strictly positive; in the
case n = 1 the coefficients $a_{|i-j|+2k}$ are strictly positive while the coefficients
$a_{|i-j|+k}$, k odd, are zero, as the example below illustrates.

We now turn our attention to the case n > 1. 



Conjecture 4.1. For n an integer larger than one, the coefficients in the 
expansion for the product PiPj above alternate in sign. 

This conjecture is backed up by extensive experiments, one of which is shown 
below. It deals with the case of w (that is, i and j) equal to 3 and 4. Richard 
Askey supplied a proof of this conjecture. This gives us a new chance to thank 
him for many years of encouragement and help. 

The product of the (scalar-valued, and properly normalized) functions $\Phi(3,t)$
and $\Phi(4,t)$ is given by the expansion

$$\Phi(3,t)\,\Phi(4,t) = a_1\Phi(1,t) + a_2\Phi(2,t) + a_3\Phi(3,t) + a_4\Phi(4,t) + a_5\Phi(5,t) + a_6\Phi(6,t) + a_7\Phi(7,t),$$

with coefficients given by the expressions

$$
\begin{aligned}
a_1 &= \frac{(n+2)(n+3)(n+4)}{(n+8)(n+9)(n+10)}, \\[4pt]
a_2 &= -\frac{6(n-1)(n+3)(n+4)(n+6)^2}{(n+7)(n+8)(n+9)(n+10)(n+11)}, \\[4pt]
a_3 &= \frac{3(n+4)(n+5)(7n^3+52n^2+67n+162)}{(n+7)(n+9)(n+10)(n+11)(n+12)}, \\[4pt]
a_4 &= -\frac{4(n-1)(n+6)(11n^3+123n^2+436n+648)}{(n+8)(n+9)(n+11)(n+12)(n+13)}, \\[4pt]
a_5 &= \frac{3(n+5)(n+6)(n+7)(19n^3+155n^2+162n+504)}{(n+8)(n+9)(n+10)(n+11)(n+13)(n+14)}, \\[4pt]
a_6 &= -\frac{42(n-1)(n+5)(n+6)^2(n+7)(n+8)}{(n+9)(n+10)(n+11)(n+12)(n+13)(n+15)}, \\[4pt]
a_7 &= \frac{14(n+5)(n+6)^2(n+7)^2(n+8)}{(n+10)(n+11)(n+12)(n+13)(n+14)(n+15)}.
\end{aligned}
$$



This shows that even in the scalar-valued case, as soon as we are dealing 
with nonclassical spherical functions we encounter an interesting sign alternating 
property that is quite different from the more familiar case. Here and below we 
see that things become different once n is an integer larger than one. 
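
The sign pattern in this example can be checked with exact rational arithmetic. The sketch below is not part of the original computation. It assumes the seven coefficients of the $\Phi(3,t)\Phi(4,t)$ expansion as displayed above, with the even-indexed coefficients $a_2$, $a_4$, $a_6$ carrying a leading minus sign. That sign choice is an assumption here: it is the one consistent with strict positivity at n = 0 and with the normalization $\Phi(w,1) = 1$, which forces the coefficients to sum to 1. The function name `linearization_coefficients` is hypothetical.

```python
from fractions import Fraction


def linearization_coefficients(n):
    """Coefficients a_1..a_7 of Phi(3,t)*Phi(4,t) for l = 0, as exact fractions."""
    n = Fraction(n)
    a1 = (n + 2) * (n + 3) * (n + 4) / ((n + 8) * (n + 9) * (n + 10))
    a2 = -6 * (n - 1) * (n + 3) * (n + 4) * (n + 6) ** 2 / (
        (n + 7) * (n + 8) * (n + 9) * (n + 10) * (n + 11))
    a3 = 3 * (n + 4) * (n + 5) * (7 * n**3 + 52 * n**2 + 67 * n + 162) / (
        (n + 7) * (n + 9) * (n + 10) * (n + 11) * (n + 12))
    a4 = -4 * (n - 1) * (n + 6) * (11 * n**3 + 123 * n**2 + 436 * n + 648) / (
        (n + 8) * (n + 9) * (n + 11) * (n + 12) * (n + 13))
    a5 = 3 * (n + 5) * (n + 6) * (n + 7) * (19 * n**3 + 155 * n**2 + 162 * n + 504) / (
        (n + 8) * (n + 9) * (n + 10) * (n + 11) * (n + 13) * (n + 14))
    a6 = -42 * (n - 1) * (n + 5) * (n + 6) ** 2 * (n + 7) * (n + 8) / (
        (n + 9) * (n + 10) * (n + 11) * (n + 12) * (n + 13) * (n + 15))
    a7 = 14 * (n + 5) * (n + 6) ** 2 * (n + 7) ** 2 * (n + 8) / (
        (n + 10) * (n + 11) * (n + 12) * (n + 13) * (n + 14) * (n + 15))
    return [a1, a2, a3, a4, a5, a6, a7]


for n in range(4):
    coeffs = linearization_coefficients(n)
    signs = "".join("+" if c > 0 else "-" if c < 0 else "0" for c in coeffs)
    print(n, signs, sum(coeffs))
```

Evaluating both sides of the expansion at t = 1 gives the sum rule used as a sanity check; the printed sign strings show the transition from all positive (n = 0), to vanishing even-indexed terms (n = 1), to alternation (n > 1).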

Now we explore the picture in the case of general l. 

The case l > 0, n > 1 

Conjecture 4.2. If $i < j$ then the product of $\Phi(i,t)$ and $\Phi(j,t)$ allows for a
(unique) expansion of the form

$$\Phi(i,t)\,\Phi(j,t) = \sum_{k=j-i-l}^{j+i+l} A_k\,\Phi(k,t).$$

Here the coefficients $A_k$ are matrices and the matrix-valued function $\Phi(w,t)$ is
the one introduced in Section 3. This conjecture holds for all nonnegative n and
is well known for l = 0 and n = 0.

In the case of l = 0 we obtain the usual range in the expansion coefficients,
ranging from $j - i$ to $j + i$ as in the case of addition of angular momenta. For
larger values of l we see that extra terms appear at each end of the expansion.

Conjecture 4.3. If $i < j$ then the coefficients $A_k$ in the expansion

$$\Phi(i,t)\,\Phi(j,t) = \sum_{k=j-i-l}^{j+i+l} A_k\,\Phi(k,t)$$

with k in the range $j - i, \dots, j + i$ have what we propose to call “the hook alternating
property.”

We will explain this conjecture by displaying one example. First notice that we 
exclude those coefficients that are not in the traditional or usual range discussed 
above. 

At this point it may be appropriate in the name of truth in advertisement 
to admit that we have no concrete evidence of the significance of the property 
alluded to above and displayed towards the end of the paper. We trust that the 
reader will find the property cute and intriguing. It would be very disappointing 
if nobody were to find some use for it. 

The results illustrated below have been checked for many values of l > 0, but
are displayed here for l = 1 only.

Recall that from [GPT1] the rows that make up the matrix-valued function
H(t,w) are given as follows: the first row is obtained from the column vector



m = 



—w, w + n + 3, A — n 
n + 2, A — n — 1 



—w, w + n + 3 



A = — w(«; + n + 3) 

and the second row comes from the column vector 



m = 



—w, w + n + 4 



, Ll \ F f-w-l, w + n + 3, A 
- ( " + 1)3i M „ + l, A-l ;f 



A = —w(w + n + 4) — n — 2. 

The product of the matrices $\Phi(2,t)$ and $\Phi(6,t)$ is given by the expansion

$$\Phi(2,t)\,\Phi(6,t) = A_3\Phi(3,t) + A_4\Phi(4,t) + A_5\Phi(5,t) + A_6\Phi(6,t) + A_7\Phi(7,t) + A_8\Phi(8,t) + A_9\Phi(9,t),$$

where $A_3$ has the nonzero entry

$$\frac{16(n+4)(n+5)(n+6)^2(n+7)^2}{(n+11)(n+12)(n+13)(n+14)(n+15)(n+16)},$$

$$A_4 = \begin{pmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{pmatrix} \quad\text{with}$$

$$
\begin{aligned}
L_{11} &= \frac{15(n+5)^2(n+6)(n+8)}{2(n+12)(n+13)(n+14)(n+15)}, \\[4pt]
L_{12} &= \frac{5(n+5)(n+6)(4n^2+55n+216)}{6(n+13)(n+14)(n+15)(n+16)}, \\[4pt]
L_{21} &= \frac{(n+5)(n+6)(n+7)(8n^2+153n+724)}{2(n+12)(n+13)(n+14)(n+15)(n+16)}, \\[4pt]
L_{22} &= -\frac{5(n+6)(n+7)(248n^4+4665n^3+27202n^2+45137n-23252)}{12(n+11)(n+13)(n+14)(n+15)(n+16)(n+17)},
\end{aligned}
$$

$$A_5 = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix} \quad\text{with}$$

$$
\begin{aligned}
M_{11} &= -\frac{(n+5)(n+6)(185n^3+3284n^2+15732n+10368)}{6(n+7)(n+12)(n+14)(n+15)(n+16)}, \\[4pt]
M_{12} &= -\frac{(n+5)(85n^4+1817n^3+11380n^2+7072n-93460)}{7(n+7)(n+13)(n+15)(n+16)(n+17)}, \\[4pt]
M_{21} &= -\frac{(n+6)^2(170n^4+4735n^3+42068n^2+99767n-168628)}{12(n+7)(n+12)(n+14)(n+15)(n+16)(n+17)}, \\[4pt]
M_{22} &= \frac{4327n^7+163698n^6+2480127n^5+19091004n^4+78090428n^3+163454544n^2+172290528n+132098688}{14(n+7)(n+12)(n+13)(n+15)(n+16)(n+17)(n+18)},
\end{aligned}
$$

$$A_6 = \begin{pmatrix} N_{11} & N_{12} \\ N_{21} & N_{22} \end{pmatrix} \quad\text{with}$$

$$
\begin{aligned}
N_{11} &= \frac{2(193n^5+5832n^4+65284n^3+328884n^2+727621n+634422)}{7(n+8)(n+13)(n+14)(n+16)(n+17)}, \\[4pt]
N_{12} &= \frac{171n^5+4729n^4+45764n^3+188570n^2+442336n+1133640}{8(n+8)(n+14)(n+15)(n+17)(n+18)}, \\[4pt]
N_{21} &= \frac{171n^6+7071n^5+116213n^4+959879n^3+4245034n^2+10640548n+15755112}{7(n+8)(n+13)(n+14)(n+16)(n+17)(n+18)}, \\[4pt]
N_{22} &= -\frac{4269n^7+169934n^6+2677678n^5+21066480n^4+85737209n^3+169428298n^2+129986220n-46794888}{8(n+8)(n+13)(n+14)(n+15)(n+17)(n+18)(n+19)},
\end{aligned}
$$

$$A_7 = \begin{pmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{pmatrix} \quad\text{with}$$

$$
\begin{aligned}
P_{11} &= -\frac{3(n+5)(129n^4+3710n^3+36430n^2+129960n+76536)}{8(n+9)(n+14)(n+15)(n+16)(n+18)}, \\[4pt]
P_{12} &= -\frac{(n+5)(n+10)(57n^3+917n^2+2274n-11268)}{3(n+9)(n+15)(n+16)(n+17)(n+19)}, \\[4pt]
P_{21} &= -\frac{3(57n^6+2505n^5+44489n^4+389955n^3+1576582n^2+1465908n-4434696)}{8(n+9)(n+14)(n+15)(n+16)(n+18)(n+19)}, \\[4pt]
P_{22} &= \frac{2(n+10)(829n^6+27979n^5+352571n^4+2024521n^3+5197384n^2+5712396n+5004720)}{3(n+9)(n+14)(n+15)(n+16)(n+17)(n+19)(n+20)},
\end{aligned}
$$

$$A_8 = \begin{pmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{pmatrix} \quad\text{with}$$

$$
\begin{aligned}
Q_{11} &= \frac{5(n+5)(n+6)(21n^2+401n+1920)}{6(n+15)(n+16)(n+17)(n+18)}, \\[4pt]
Q_{12} &= \frac{15(n+5)(n+6)(n+8)(n+11)}{2(n+16)(n+17)(n+18)(n+19)}, \\[4pt]
Q_{21} &= \frac{5(n+6)(10n^4+329n^3+4942n^2+36611n+96300)}{6(n+15)(n+16)(n+17)(n+18)(n+20)}, \\[4pt]
Q_{22} &= -\frac{3(n+6)(n+11)(430n^4+9773n^3+67728n^2+129129n-59220)}{4(n+15)(n+16)(n+17)(n+18)(n+19)(n+21)},
\end{aligned}
$$

$$A_9 = \begin{pmatrix} 0 & 0 \\ T_{21} & T_{22} \end{pmatrix} \quad\text{with}$$

$$
\begin{aligned}
T_{21} &= \frac{99(n+4)(n+6)(n+7)(n+10)}{4(n+16)(n+17)(n+18)(n+19)(n+20)}, \\[4pt]
T_{22} &= \frac{165(n+4)(n+6)(n+7)(n+8)(n+10)(n+12)}{2(n+16)(n+17)(n+18)(n+19)(n+20)(n+21)}.
\end{aligned}
$$

Notice that if we concentrate our attention on the coefficients within the
traditional range we see that the first matrix $A_4$ has its first hook made up of
positive entries, while the second hook (which in this example consists of only one
entry) has negative signs. The second matrix $A_5$ has its first hook negative, the
second hook positive. The third matrix $A_6$ repeats the behavior of the first one,
the fourth one $A_7$ imitates the second one, and so on.

Extensive experimentation shows that this double alternating property holds 
for values of l greater than zero. For coefficient matrices in the traditional 
expansion range, the first matrix has its first hook positive, the second one 
negative, the third one positive, etc. The second matrix has the same alternating 
pattern of signs for the hooks but its first hook is negative. The third matrix 
imitates the first, etc. 
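
The following picture captures the phenomenon described above for n larger
than one and when the index k is in the traditional range.

$$
\begin{pmatrix}
+ & + & + & \cdots \\
+ & - & - & \cdots \\
+ & - & + & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix},
\qquad
\begin{pmatrix}
- & - & - & \cdots \\
- & + & + & \cdots \\
- & + & - & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix},
\qquad \text{etc.}
$$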



5. The Relation with Matrix-Valued Orthogonal Polynomials

We close the paper remarking, once again, that our matrix-valued spherical
functions are orthogonal with respect to a nice inner product and have polynomial
entries. Yet, they do not fit directly into the existing theory of matrix-valued
orthogonal polynomials as given for instance in [D] and [DVA].

It is however possible to establish such a connection: define the matrix-valued 
function T (j, t) by means of the relation 

$(. i,t) = 

It is now a direct consequence of the definitions that the family T (j, t) satisfies 
all the standard requirements in [DVA] and not only satisfies a three term recur- 
sion relation but also T(j, t) 1 satisfies a fixed differential equation with matrix 
coefficients and only the “eigenvalue matrix” depends on j. In other words the 
family T (j, t ) meets all the conditions given at the beginning of Section 3 and 
meets also the conditions of the standard theory in [DVA] giving an example 
of a classical family of matrix-valued orthogonal polynomials. In particular, the 
coefficients in the differential operator D (obtained by conjugation from the one 
in [GPT1]) are matrix polynomials of degree going with the order of differentiation.
For a nice introduction to this circle of ideas, see the pioneering work in
[D].

Acknowledgments. We are much indebted to the editors for suggesting a
number of places where the exposition could be improved. Grünbaum acknowledges
a useful conversation with A. Duran that steered him in the direction of
Section 5 above.

Image Registration for MRI

PETER J. KOSTELEC AND SENTHIL PERIASWAMY 



Abstract. To register two images means to align them so that common
features overlap and differences (for example, a tumor that has grown)
are readily apparent. Being able to easily spot differences between two
images is obviously very important in applications. This paper is an
introduction to image registration as applied to medical imaging. We first define
image registration, breaking the problem down into its constituent
components. We then discuss various techniques, reflecting different choices that
can be made in developing an image registration technique. We conclude
with a brief discussion.



1. Introduction 

1.1. Background. To register two images means to align them, so that com- 
mon features overlap and differences, should there be any, between the two are 
emphasized and readily visible to the naked eye. We refer to the process of 
aligning two images as image registration. 

There are a host of clinical applications requiring image registration. For
example, one would like to compare two Computed Tomography (CT) scans
of a patient, taken say six months ago and yesterday, and identify differences
between the two, e.g., the growth of a tumor during the intervening six months
(Figure 1). One might also want to align Positron Emission Tomography (PET)
data to an MR image, so as to help identify the anatomic location of certain
mental activation [43]. And one may want to register lung surfaces in chest
CT scans for lung cancer screening [7]. While all
of these identifications can be done in the radiologist’s head, the possibility
always exists that small, but critical, features could be missed. Also, beyond
identification itself, the extent of alignment required could provide important
quantitative information, e.g., how much a tumor’s volume has changed.

Figure 1. Two CT images showing a pelvic tumor's growth over time. The
grayscale has been adjusted so as to make the tumor, the darker gray area within
the mass in the center of each image, more readily visible. In actuality, it is
barely darker than the background tissue.



When registering images, we are determining a geometric transformation
which aligns one image to fit another. For a number of reasons, simple image
subtraction does not work. MR image volumes are acquired one slice at a
time. When comparing a six-month-old MR volume with one acquired yesterday,
chances are that the slices (or “imaging planes”) from the two volumes are not
parallel. As a result, the perspectives would be different. By this, we mean the
following. Consider a right circular cone. A plane slicing through the cone,
parallel to its base, forms a circle. If the slice is slightly off parallel, an ellipse
results. In terms of human anatomy, a circular feature in the first slice appears
as an ellipse in the second. In the case of mammography, tissue is compressed
differently from one exam to the next. Other architectural distortions are possible.
Since the body is an elastic structure, how it is oriented in gravity induces a
variety of non-rigid deformations. These are just some of the reasons why simple
image subtraction does not work.

For the neuroscientist doing research in functional Magnetic Resonance Imag- 
ing (fMRI), the ability to accurately align image volumes is of vital importance. 
Their results acutely depend on accurate registration. To provide a brief back- 
ground, to “do” fMRI means to attempt to determine which parts of the brain 
are active in response to some given stimulus. For instance, the human subject, 
in the MR scanner, would be asked to perform some task, e.g., finger-tap at 
regular intervals, or attend to a particular instrument while listening to a piece 
of music [20], or count the number of occurrences of a particular color when 
shown a collection of colored squares [8]. As the subject performs the task, the 
researcher effectively takes 3-D MR movies of the subject’s brain. The goal is to
identify those parts of the brain responsible for processing the information the
stimulus provides. The researcher’s hope of accomplishing this is based on the
Blood Oxygenation Level Dependent (BOLD) hypothesis (see [6]).

Figure 2. fMRI. By registering the frames in the MR “movie” and performing
statistical analyses, the researcher can identify the active part(s) of the brain
by finding those pixels whose intensities change most in response to the given
stimulus. The active pixels are usually false-coloured in some fashion, to make
them more obvious, similar to those shown in this figure.

The BOLD hypothesis roughly states that the parts of the brain that process
information, in response to some stimulus, need more oxygen than those parts
which do not. Changes in the blood oxygen level manifest themselves as changes
in the strength of the MR signal. This is what the researcher attempts to detect
and measure. The challenge lies in the fact that the changes in signal strength are
very small, on the order of only a few percent greater than background noise [5].
And to make matters worse, the subject, despite their noblest intentions, cannot
help but move at least ever so slightly during the experiment. So, before useful
analysis can begin, the signal strength must be maximized.

This is accomplished by task repetition, i.e., having the subjects repeat the
task over and over again. Then all the image volumes are registered within each
subject. Assuming Gaussian noise, adding the registered images will strengthen
the elusive signal relative to the noise. Statistical analyses are done within
subject, and then combined across all subjects. This is the usual order of events [18].
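
The benefit of repetition can be illustrated numerically. The following is a minimal sketch, not the processing pipeline of any particular study: it assumes perfectly registered frames, independent zero-mean Gaussian noise of unit standard deviation, and a weak synthetic "activation" patch. All names (`averaged_noise_std`, `signal`, `sigma`) and all numbers are hypothetical, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "activation": a weak patch, far below the noise level.
signal = np.zeros((32, 32))
signal[12:20, 12:20] = 0.05
sigma = 1.0  # assumed noise standard deviation per frame


def averaged_noise_std(n_repeats):
    """Std of the residual noise after averaging n_repeats registered frames."""
    noise = rng.normal(0.0, sigma, size=(n_repeats,) + signal.shape)
    frames = signal + noise           # each frame: same signal, fresh noise
    mean_frame = frames.mean(axis=0)  # average the aligned frames
    return float((mean_frame - signal).std())


for n in (1, 4, 16, 100):
    print(n, round(averaged_noise_std(n), 3), "expected about", round(sigma / np.sqrt(n), 3))
```

Under these assumptions the residual noise falls roughly like one over the square root of the number of repetitions, while the signal is unchanged by averaging. This is why accurate registration of the repeated volumes matters: misaligned frames would blur the signal instead of reinforcing it.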

1.2. What’s inside this paper. This will be a whirlwind, and by no means 
exhaustive, tour of image registration for MRI. We will briefly touch upon a few 
of the many and varied techniques used to register MR images. Note that the 
survey articles by Brown [11] and Van den Elsen [38] are excellent sources for 
more in-depth discussion of image registration, the problem and the techniques. 
Our purpose here, within this paper, is to whet the reader’s appetite, to stimulate 
her interest in this very important image processing challenge, a challenge which 
has a host of applications, both in medical imaging and beyond. 

The paper is organized as follows. We first give some background and establish
a theoretical framework that will provide a means of defining the critical
components involved in image registration. This will enable us to identify those 
issues which need to be addressed when performing image registration. This will 
be followed by examples of various registration techniques, explained at varying 
depths. The methods presented are not meant to represent any sort of definitive 
list. We want to point out to the reader just some of the techniques which exist, 
so that they can appreciate how difficult the problem of image registration is, as 
well as how varied the solutions can be. We close with a brief discussion. 

Acknowledgments. We thank Daniel Rockmore and Dennis Healy for inviting 
us to participate in the MSRI Summer Graduate Program in Modern Signal 
Processing, June 2001. We also thank Digger ‘The Boy’ Rockmore for helpful 
discussions, and for granting us the use of his image in this paper. 



2. Theory 

Suppose we have two brain MR images, taken of the same subject, but at
different times, say, six months ago and yesterday. We need to align the
six-month-old image, which we will call the source image, with the one acquired yesterday,
the target image. (These terms will be used throughout this paper.) A tumor
has been previously identified, and the radiologist would like to determine how
much the tumor has grown during the six months. Instead of trying to “eyeball
it,” registering the two images would enable a quantitative estimate of the growth rate.
How do we proceed?

Do we assume that a simple rigid motion will suffice? Determining the correct 
rotation and translation parameters is, as we will see later, a relatively quick 
and straightforward process. However, if non-linear deformations have occurred 
within the brain (which, as described in Sec. 1.1, is likely for any number of 
reasons), applying a rigid motion model in this situation will not produce an 
optimal alignment. So probably some sort of non-rigid or elastic model would 
be more appropriate. 

Are we looking to perform a global alignment, or a local one? That is, will 
the same transformation, e.g., affine, rigid body, be applied to the entire image, 
or should we instead employ a local model of sorts, where different parts of the 
image/volume are moved in different, though smoothly connected, ways? 

Should the method we use depend on active participation by the radiologist, 
to help “prime” or “guide” the method so that accurate alignment is achieved? 
Or do we instead want the technique to be completely automated and free of 
human intervention? 

Wow, that’s a lot of questions we have to think about, and answer, too. How 
do we begin? To tackle the alignment problem, we had first better organize it. 

2.1. The four components. The multitude of challenges inherent in performing
image registration can be better addressed by distilling the problem into four
distinctive components [11].

I. The feature space. Before registering two images, we must decide exactly what
it is that will be registered. The type of algorithm developed depends critically
on the features chosen. And when you think about it, there are a lot of features
from which to choose. Will we work with the raw pixel intensities themselves?
Or perhaps the edges and contours of the images? If we have volumetric data,
perhaps we should use the surface the volume defines, as in a 3-D brain scan?
We could have the user identify features common to both images, with the intent
of aligning those landmarks. Then again, if we wish to align images of different
modalities, say MRI with PET, then perhaps statistical properties of the images
would be optimal for our purpose. So you see, the feature space we choose
will really drive the algorithm we develop.

II. The search space. When one says, “I want to align these two images,” what
is one really saying? That is, what is the rigorous form of the sentence? The
two images can be considered samples of two (unknown), compactly supported,
real-valued functions, $f(x)$, $g(x)$, defined on $\mathbb{R}^n$ (where n is 2 or 3). To align the images
means we wish to find a transformation $T(x)$ such that $f(x) = g(T(x))$ for all
x. Fine. So what kind of transformation are we willing to consider? This is the
Search Space we need to define.

For example, you can consider the simple rigid body transformations, rotation 
plus translation. Or, if you would like to account for differences in scale, you may 
instead decide to search for the best affine transformation. But both of these 
transformations are global in some respect, and you may want to do something 
more localized or elastic, and transform different parts of the image by differing 
amounts, e.g., to account for non-uniform deformations. Your decision here will 
very much influence the nature of the registration algorithm. 

III. The search strategy. Suppose we have chosen our Search Space. We select
a transformation $T_0(x)$ and try it. Based on the results of $T_0(x)$, how should
we choose the next transformation, $T_1(x)$, to try? There are any number of
ways: Linear Programming techniques; a relaxation method; some sort of energy
minimization.

IV. The similarity metric. This ties in with the Search Strategy. When comparing
the new transformation with the old, we need to quantify the differences between
the geometrically transformed source image and the target image. That
is, we need to measure how well $f(x)$ compares with $g(T(x))$. Using
mean-squared error might be a suitable choice. Or perhaps correlation is the key.
Our choice will depend on many factors, such as whether or not the two images
are of the same modality.

So once these choices are made, our search for an optimal transformation, one
that aligns the source image with the target, continues until we find one that
makes us happy.
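
To make the four components concrete, here is a deliberately tiny sketch in which each choice is the simplest one available: the feature space is the raw pixel intensities, the search space is integer translations (applied circularly), the search strategy is exhaustive enumeration, and the similarity metric is mean-squared error. It is an illustration under those assumptions, not a method from the literature cited here; the names `mse` and `register_translation` are hypothetical.

```python
import numpy as np


def mse(a, b):
    """Similarity metric: mean-squared error between two equally sized images."""
    return float(np.mean((a - b) ** 2))


def register_translation(source, target, max_shift=5):
    """Exhaustive search over integer circular shifts (dy, dx) of the source.

    Returns the shift minimizing the MSE against the target, and that MSE.
    """
    best_shift, best_score = None, None
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            candidate = np.roll(source, (dy, dx), axis=(0, 1))  # apply T
            score = mse(candidate, target)                      # compare
            if best_score is None or score < best_score:
                best_shift, best_score = (dy, dx), score
    return best_shift, best_score


rng = np.random.default_rng(1)
source = rng.random((32, 32))
target = np.roll(source, (2, -3), axis=(0, 1))  # target is a shifted source
print(register_translation(source, target))
```

Changing any one component changes the algorithm: a richer search space (rotations, affine maps) makes exhaustive search impractical and calls for a smarter strategy, while images of different modalities would call for a different metric.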



3. A Potpourri of Methods 

Given the content in Section 2, the reader can well believe that there are
a multitude of registration methods possible, each resulting from a particular
choice of feature and search spaces, search strategy, and similarity metric. But
always bear in mind that there is no single right registration algorithm. Each
technique has its own strengths and weaknesses. It all depends on what you
want.

Very broadly speaking, registration techniques may be divided into two cat- 
egories, rigid and nonrigid. Some examples of Rigid registration techniques in- 
clude: Principal Axes [2], Correlation-based methods [12], Cubic B-Splines [37], 
and Procrustes [19; 34]. For Non-Rigid techniques, there are Spline Warps [9], 
Viscous Fluid Models [13], and Optic Flow Fields [30]. 

The survey articles [11; 38] mentioned previously go into some of these tech- 
niques in greater depth. Now, to begin our “If it’s Tuesday, this must be Bel- 
gium” tour of MR image registration techniques. 

3.1. Principal Axes. We begin with the Principal Axes algorithm (e.g.,
see [2]). To summarize its properties, based on the classification scheme of
Section 2.1, the feature space the algorithm acts upon effectively consists of
the features of the images, such as edges, corners, and the like. The search
space consists of global translations and rotations. The search strategy is not
so much a “search,” as we are finding the closed-form solution based on the
eigenvalue decomposition of a certain covariance matrix. The similarity metric is
the variance of the projection of the feature’s location vector onto the principal
axis.

The algorithm is based on the straightforward and powerful observation that 
the head is shaped like an ellipse/ellipsoid (depending on the dimension). For 
purposes of image registration, the critical features of an ellipse are its center 
of mass, and principal orientations, i.e., major and minor axes. Using these 
properties, one can derive a straightforward alignment algorithm which can au- 
tomatically and quickly determine a rotation + translation that aligns the source 
image to the target. 

Let I denote the 2-D array representing an image, with pixel intensity $I(x,y)$
at location $(x,y)$. The center of mass, or centroid, is

$$\bar{x} = \frac{\sum_{x,y} x\,I(x,y)}{\sum_{x,y} I(x,y)}, \qquad \bar{y} = \frac{\sum_{x,y} y\,I(x,y)}{\sum_{x,y} I(x,y)}.$$

Figure 3. Principal axes. The eigenvectors E and e, corresponding to the largest
and smallest eigenvalues, respectively, indicate the directions of the major and
minor axes, respectively.

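With the centroid in hand, we form the covariance matrix

$$C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix},$$

where

$$
\begin{aligned}
c_{11} &= \sum_{x,y} (x-\bar{x})^2\, I(x,y), \\
c_{22} &= \sum_{x,y} (y-\bar{y})^2\, I(x,y), \\
c_{12} &= \sum_{x,y} (x-\bar{x})(y-\bar{y})\, I(x,y), \\
c_{21} &= c_{12}.
\end{aligned}
$$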

The eigenvectors of C corresponding to the largest and smallest eigenvalues 
indicate the direction of the major and minor axes of the ellipse, respectively. 
See Figure 3. 

The principal axes algorithm may be described as follows. First, calculate
the centroid and eigenvectors of the source and target images via an eigenvalue
decomposition of the covariance matrices. Next, align the centers of mass via a
translation. Next, for each image determine the angle α (Figure 3) the maximal
eigenvector forms with the horizontal axis, and rotate the source image about its
center by the difference in angles. The images are now aligned.

Figure 4 shows the procedure in action. In this example, the target image is 
a rotated version of the source image, with a small block missing. Subtracting 
the target from the aligned source renders the missing data quite apparent. 
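
The centroid, covariance matrix and principal angle described in this section can be computed in a few lines. What follows is a minimal sketch assuming a 2-D intensity array indexed as `image[y, x]`, with the angle measured in array coordinates (so the y axis points down the rows). The function names `principal_axes` and `ellipse_image` are hypothetical. The sketch returns only the alignment parameters; resampling the rotated image is left to whatever interpolation routine the reader prefers.

```python
import numpy as np


def principal_axes(image):
    """Return ((xbar, ybar), alpha): the intensity centroid and the angle
    (radians, in (-pi/2, pi/2]) of the major axis with the horizontal axis."""
    img = np.asarray(image, dtype=float)
    ys, xs = np.indices(img.shape)          # pixel coordinates
    total = img.sum()
    xbar = (xs * img).sum() / total
    ybar = (ys * img).sum() / total
    dx, dy = xs - xbar, ys - ybar
    c11 = (dx * dx * img).sum()
    c22 = (dy * dy * img).sum()
    c12 = (dx * dy * img).sum()
    cov = np.array([[c11, c12], [c12, c22]])
    _, evecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    major = evecs[:, -1]                    # eigenvector of largest eigenvalue
    alpha = np.arctan2(major[1], major[0])
    # An axis has no direction: fold the angle into (-pi/2, pi/2].
    if alpha <= -np.pi / 2:
        alpha += np.pi
    elif alpha > np.pi / 2:
        alpha -= np.pi
    return (xbar, ybar), alpha


def ellipse_image(shape, center, semi_axes, theta):
    """Binary test image: a filled ellipse rotated by theta about center=(x, y)."""
    ys, xs = np.indices(shape)
    dx, dy = xs - center[0], ys - center[1]
    u = dx * np.cos(theta) + dy * np.sin(theta)
    v = -dx * np.sin(theta) + dy * np.cos(theta)
    return ((u / semi_axes[0]) ** 2 + (v / semi_axes[1]) ** 2 <= 1.0).astype(float)


source = ellipse_image((101, 101), (50, 40), (30, 10), 0.4)
target = ellipse_image((101, 101), (55, 52), (30, 10), 0.1)
(c_s, a_s), (c_t, a_t) = principal_axes(source), principal_axes(target)
print("translation:", (c_t[0] - c_s[0], c_t[1] - c_s[1]), "rotation:", a_t - a_s)
```

Because an axis is only defined up to sign, the recovered rotation is ambiguous by 180 degrees; a practical implementation would need an extra cue (or a comparison of both candidates) to resolve it.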

While the principal axes algorithm is easy to implement, it does have the
shortcoming that it is sensitive to missing data. As an exaggerated example,
suppose the target MR image covers the entire head, while the source MR image
has only the top half, say from the eyes on up. In this case, the anatomical
feature located at the centroid of the source image will differ from the anatomical
feature located at the centroid of the target. However, be that as it may, one
can certainly use the algorithm to provide a coarse approximation to “truth.”
That is, one may use the rotation + translation parameters as “seed” values for more
accurate methods.

Figure 4. Principal axes: aligning axial images. The difference between the
aligned source and target images is easily apparent in the far right panel.

3.2. Fourier-based correlation. Fourier-based Correlation is another method
for performing rigid alignment of images. The feature space it uses consists of
all the pixels in the image, and its search space covers all global translations and
rotations. (It can also be used to find local translations and rotations [31].) As
the name implies, the search strategy is the closed-form Fourier-based method,
and the similarity metric is correlation and its variants, e.g., phase-only
correlation [12]. As with Principal Axes, it is an automatic procedure by which
two images may be rigidly aligned. Furthermore, it is an efficient algorithm,
courtesy of the FFT [12].

The algorithm may be described as follows. Let $f(x,y)$ and $g(x,y)$ denote
the source and target images, respectively. Uppercase letters will denote the
function’s Fourier transform (FT):

$$f(x,y) \overset{FT}{\longleftrightarrow} F(\omega_x,\omega_y), \qquad g(x,y) \overset{FT}{\longleftrightarrow} G(\omega_x,\omega_y).$$

To clarify, $(x,y)$ denote coordinates in the spatial domain, and $(\omega_x,\omega_y)$ denote
coordinates in the frequency domain. Suppose the source and target are related
by a translation $(a,b)$ and rotation $\theta$:

$$f(x,y) = g\bigl((x\cos\theta + y\sin\theta) - a,\ (-x\sin\theta + y\cos\theta) - b\bigr).$$

Then, using properties of the Fourier transform, we have

$$F(\omega_x,\omega_y) = e^{-i(a\omega_x + b\omega_y)}\,G(\omega_x\cos\theta + \omega_y\sin\theta,\ -\omega_x\sin\theta + \omega_y\cos\theta).$$

By taking norms and obtaining the power spectrum, all evidence of translation
by $(a,b)$ has disappeared:

$$|F(\omega_x,\omega_y)|^2 = |G(\omega_x\cos\theta + \omega_y\sin\theta,\ -\omega_x\sin\theta + \omega_y\cos\theta)|^2 .$$

Figure 5. Power spectrum, shown in cartesian coordinates and in polar
coordinates. By considering the power spectra, translations vanish. Furthermore,
in polar coordinates, rotations become translations.

Note that rotating $g(x,y)$ by $\theta$ in the spatial domain is equivalent to rotating
$|G(\omega_x,\omega_y)|^2$ by that amount in the frequency domain. By switching to polar
coordinates (setting $\omega_x = r\cos\psi$, $\omega_y = r\sin\psi$), we have

$$|F(r,\psi)|^2 = |G(r,\psi-\theta)|^2$$

and hence rotation in the cartesian plane becomes translation in the polar plane.
See Figure 5.

We are now in a position to give an outline for the Fourier-based correlation 
method of image registration: 

1. Take the discrete Fourier transform of the source image f(x) and target image
g(x).

2. Next, send the power spectra to polar coordinates:

$$|F(r,\psi)|^2 = |G(r,\psi-\theta)|^2 .$$

3. Use your favourite correlation technique to determine the rotation angle.
(Note that this is strictly a translation problem.) Then rotate the source
image (which is in the spatial domain) by that amount.

4. Use your favourite correlation technique to determine the translation amount in
the spatial domain, between the (so far) only-rotated source image and the
target image.
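
The translation step of the outline above (step 4) can be sketched with phase-only correlation, one of the correlation variants mentioned earlier. This is a minimal illustration, not the implementation of [12]: it assumes two equally sized images related by a circular (wrap-around) integer shift, and the function name `phase_correlation_shift` is hypothetical. The rotation step would apply the same idea to the power spectra resampled onto a polar grid, which is omitted here.

```python
import numpy as np


def phase_correlation_shift(source, target):
    """Estimate the integer shift (dy, dx) such that
    target is approximately np.roll(source, (dy, dx), axis=(0, 1))."""
    F = np.fft.fft2(source)
    G = np.fft.fft2(target)
    cross = G * np.conj(F)
    cross /= np.maximum(np.abs(cross), 1e-12)   # keep the phase only
    corr = np.fft.ifft2(cross).real             # ideally a single sharp peak
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (wrap-around).
    return tuple(int(p) if p <= s // 2 else int(p) - s
                 for p, s in zip(peak, corr.shape))


rng = np.random.default_rng(2)
source = rng.random((64, 64))
target = np.roll(source, (5, -3), axis=(0, 1))

# Translation leaves the power spectrum unchanged ...
same_power = np.allclose(np.abs(np.fft.fft2(source)) ** 2,
                         np.abs(np.fft.fft2(target)) ** 2)
# ... while the phase carries the shift.
print(same_power, phase_correlation_shift(source, target))
```

Real images are not circularly shifted copies of one another, so in practice windowing or zero-padding is commonly applied before the transform, and the correlation peak is less clean than in this synthetic case.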

Figure 6. We seek the pattern, shown on the top left, in the signal shown on
the top right. In the lower left, we plot the correlation values. The location of
the maximum value should indicate the location of the pattern within the signal,
but as we see in the lower right figure, placing the pattern, drawn in a thick line,
at this “maximum” location is incorrect.

Given how easy and direct the algorithm is, it would come as a surprise if there 
were not any caveats associated with it. 

In practice, the source and target images are probably not exactly identi- 
cal. This could easily result in multiple peaks, which means that the maximum 
peak may not be the correct one. This phenomenon is illustrated in Figure 6. 
Therefore, when using correlation to determine the proper rotation and
translation parameters, several candidate sets of parameters, e.g., those
corresponding to the 4 largest correlation peaks, need to be tried. The set that
is best in some sense (e.g., least-squares) is the one you choose. Secondly, the
images certainly should be of the same modality. Registering an MR with a PET
image probably won’t work at all!
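
The "try several peaks" advice can be sketched in one dimension as follows. The pattern, the signal and the function names are made up for illustration; the sketch simply ranks offsets by correlation, keeps the k largest, and picks the candidate with the smallest sum of squared differences.

```python
def sliding_correlation(pattern, signal):
    """Correlation of the pattern with the signal at every valid offset."""
    m = len(pattern)
    return [sum(pattern[i] * signal[s + i] for i in range(m))
            for s in range(len(signal) - m + 1)]


def sum_squared_difference(pattern, signal, offset):
    """Least-squares misfit of the pattern placed at the given offset."""
    return sum((p - signal[offset + i]) ** 2 for i, p in enumerate(pattern))


def best_of_top_peaks(pattern, signal, k=4):
    """Try the k largest correlation peaks and keep the least-squares best."""
    corr = sliding_correlation(pattern, signal)
    candidates = sorted(range(len(corr)), key=lambda s: corr[s], reverse=True)[:k]
    best = min(candidates, key=lambda s: sum_squared_difference(pattern, signal, s))
    return best, candidates


# Toy data: the pattern really sits at offset 1, but a bright plateau further
# along the signal produces a larger raw correlation value.
pattern = [1, 3, 1]
signal = [0, 1, 3, 1, 0, 5, 5, 5, 0]

best_offset, candidate_offsets = best_of_top_peaks(pattern, signal, k=4)
```

Here the largest correlation peak lies on the plateau (offset 5), while the least-squares check over the four candidates recovers offset 1, which mirrors the failure shown in Figure 6.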

But on the bright side, along with computational efficiency, one can apply the
technique to subregions of images and “glue” the results together. For example,
one can divide the images into quarters, determine rotation and translation
parameters for each quarter independently, and then smoothly blend these four
sets of parameters to obtain a complete (and non-rigid) registration of the
source to the target image [31]. Also, as with Principal Axes, Fourier-based
correlation may be used to achieve coarse registrations, as starting points for
fancier methods.

3.3. Procrustes algorithm. The Procrustes Algorithm [19; 34] is an image 
registration algorithm that depends on the active participation of the user. It 
does have as its inspiration a rather colourful character from Greek mythology. 
Especially for this reason, we feel compelled to briefly mention it. 

It is a “one size fits all” algorithm: one image is compelled to fit another. 
The name is most appropriate for this algorithm. Procrustes is a character from 
Greek mythology. He was an innkeeper who guaranteed all his beds were the 
correct length for his guests. “The top of your head will be at precisely the top 
edge of the bed. Similarly the soles of your feet will be at the bottom edge.” 
And for his (unfortunate) guests of varying heights, they were. Procrustes would 
employ some rather gruesome measures to make his claim true. Ouch. 

As already mentioned, the algorithm depends on human intervention. Quite 
simply, the user identifies common features or landmarks in the images (so this is 
the feature space) and, by rigid rotation and translation (the search space), forces 
a registration that respects these landmarks. In a perfect world, to determine 
the proper rotation and translation parameters, three pairs of landmarks would 
suffice. The rotation parameters place the images in the same orientation, the 
translation parameters, well, translate the images into alignment. 

But we do not inhabit a perfect world. The slightest variation in distance be- 
tween any homologous pair represents an error in landmark identification which 
cannot be reconciled with rigid body motions. And so we need to compromise. 
(Procrustes would have difficulty understanding this. While his enthusiasm for 
achieving a perfect fit is admirable, it could result in some uncomfortable side 
effects for the patients.) Lacking a perfect match, the similarity metric employed 
is instead the mean squared distance between homologous landmarks when com- 
puting the six rigid body parameters. The search strategy is to minimize via 
least-squares. 

The good news is that this can be accomplished efficiently. A closed form
solution exists, in fact. However, the not so good news is that it depends on the
accurate identification of landmarks. If you say that the anatomical feature at
Point $A_1$ in source image $A$ really corresponds with the anatomical feature at
Point $B_1$ in the target image $B$, you had better be right. And being right takes
time, especially since the slightest deviation is a source of error.
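
To make the closed-form claim concrete, here is a minimal sketch for the 2-D case only (one rotation angle and two translations, rather than the six rigid body parameters of 3-D). It assumes a counterclockwise rotation convention and uses hypothetical function names; it is an illustration of least-squares landmark fitting, not the implementation of [19; 34].

```python
import math


def fit_rigid_2d(source, target):
    """Least-squares rotation angle and translation mapping source onto target.

    Minimizes sum_i |R(theta) a_i + t - b_i|^2 over theta and t.
    """
    n = len(source)
    if n != len(target) or n < 2:
        raise ValueError("need at least two matched landmark pairs")
    # Centroids of both landmark sets.
    ax = sum(p[0] for p in source) / n
    ay = sum(p[1] for p in source) / n
    bx = sum(q[0] for q in target) / n
    by = sum(q[1] for q in target) / n
    # Accumulate dot and cross products of the centered landmarks.
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(source, target):
        ux, uy = px - ax, py - ay
        vx, vy = qx - bx, qy - by
        dot += ux * vx + uy * vy
        cross += ux * vy - uy * vx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation carries the rotated source centroid onto the target centroid.
    return theta, (bx - (c * ax - s * ay), by - (s * ax + c * ay))


# Three hypothetical landmark pairs related by a 30 degree rotation and a shift.
true_theta, true_t = math.pi / 6, (3.0, -1.0)
source_landmarks = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
target_landmarks = [
    (math.cos(true_theta) * x - math.sin(true_theta) * y + true_t[0],
     math.sin(true_theta) * x + math.cos(true_theta) * y + true_t[1])
    for x, y in source_landmarks
]

theta, translation = fit_rigid_2d(source_landmarks, target_landmarks)
```

With exact landmarks the fit reproduces the rotation and translation; with noisy landmarks it returns the compromise that minimizes the mean squared distance between homologous points.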

3.4. AIR: automated image registration. AIR is a sophisticated and 
powerful image registration algorithm. Developed by Woods et al. [41; 42; 43],
the feature space it uses consists of all the pixels in the image, and the search 
space consists of up to fifth-order polynomials in spatial coordinates x, y (and z, 
if 3-D), involving as many as 168 parameters. The goal is to define a single, 
global transformation. We outline some of AIR’s characteristics: 
• AIR is a fully automated algorithm.

• Unlike the algorithms so far discussed, AIR can be used in multi-modal situ- 
ations. 

• AIR does not depend on landmark identification. 

• AIR uses overall similarity between images. 

• AIR is iterative. 

It is a robust and versatile algorithm. The fact that AIR software is publicly 
available [1] has only added to its widespread use. 

AIR is based on the following assumption. If two images, acquired the same
way (i.e., same modality), are perfectly aligned, then the ratio of one image to
another, on a voxel-by-voxel basis, ought to be fairly uniform across voxels. If
registration is not spot on, then there will be a substantial degree of
nonuniformity in the ratios. Ergo, to register the two images, compute the
standard deviation of the ratio, and minimize it. This error function is called
the “ratio of image uniformity”, or RIU. The algorithm’s search strategy is based
on gradient descent, and the similarity metric is actually a normalized version
of the RIU between the two volumes. An iterative procedure is used to minimize
the normalized RIU, in which the registration parameter (among the three
rotation and three translation terms) with the largest partial derivative is
adjusted in each iteration [41].
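
A rough sketch of the cost being minimized is given below. Images are flattened to lists of voxel intensities, and the normalization (dividing the standard deviation of the ratios by their mean) is an assumption made here for illustration; the exact normalization and the handling of background voxels in AIR are described in [41].

```python
import math


def ratio_image_uniformity(image_a, image_b):
    """Standard deviation of the voxel-wise ratio a/b, divided by its mean."""
    # Voxels where the denominator is zero are skipped (an assumption).
    ratios = [a / b for a, b in zip(image_a, image_b) if b != 0]
    if not ratios:
        raise ValueError("no voxels with a nonzero denominator")
    mean = sum(ratios) / len(ratios)
    variance = sum((r - mean) ** 2 for r in ratios) / len(ratios)
    return math.sqrt(variance) / mean


# Toy "images": the second is the first with doubled intensities, so the
# ratio is perfectly uniform when the two are aligned.
image = [10.0, 20.0, 30.0, 40.0]
aligned = [20.0, 40.0, 60.0, 80.0]
misaligned = [80.0, 20.0, 40.0, 60.0]  # the aligned image shifted by one voxel

riu_aligned = ratio_image_uniformity(image, aligned)
riu_misaligned = ratio_image_uniformity(image, misaligned)
```

The aligned pair gives a cost of zero, and the shifted pair a strictly larger one, which is the behaviour the registration search exploits.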

Since we are dealing with ratios and not pixel intensities themselves, it is this 
idea of using the ratios to register images which provides us with the flexibility 
to align images of different modalities. 

Suppose we are in the situation where we want to align an MR to a PET 
image. On the face of it, the ratios will not be uniform across the images. 
Different tissue types will have different ratios. However, and this is key, within 
a given tissue type, the ratio ought to be fairly uniform when the images are 
registered. Therefore, what you want to do is maximize the uniformity within 
the tissue type, where the tissue-typing is based on the MRI voxel intensity. 
This requires two modifications of the original algorithm [43]. First, one has
to manually edit the scalp, skull and meninges from the MR image since these
features are not present in the PET image. The second modification consists of
first performing a histogram matching. Denote the two images to be histogram
matched as $f_1(\cdot)$ and $f_2(\cdot)$, and $c_2(\cdot)$ as the sampled cumulative
distribution function of image $f_2(\cdot)$. The histogram of $f_2(\cdot)$ is made to
match that of $f_1(\cdot)$ by mapping each pixel $f_2(x,y)$ to $c_2(f_1(x,y))$, between
the MR and PET images (with 256 bins), followed by a segmentation of the images
according to the 256 bin values. Each of the segmented MR and PET images (with
corresponding bin values) is then registered separately.
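
For readers unfamiliar with histogram matching, the following is a sketch of the standard technique based on cumulative distribution functions: each intensity level is sent to the smallest reference level whose cumulative count fraction is at least as large. It is a generic illustration with hypothetical names and a tiny number of levels, and is not claimed to reproduce the exact mapping used in [43].

```python
def cumulative_counts(pixels, levels):
    """Cumulative histogram: entry v is the number of pixels with value <= v."""
    counts = [0] * levels
    for p in pixels:
        counts[p] += 1
    running, cumulative = 0, []
    for c in counts:
        running += c
        cumulative.append(running)
    return cumulative


def match_histogram(pixels, reference, levels=256):
    """Remap pixel values so their histogram resembles that of the reference."""
    cum_src = cumulative_counts(pixels, levels)
    cum_ref = cumulative_counts(reference, levels)
    n_src, n_ref = len(pixels), len(reference)
    lookup, w = [], 0
    for v in range(levels):
        # Advance to the first reference level whose cumulative fraction
        # reaches that of level v (compared exactly, without floating point).
        while w < levels - 1 and cum_ref[w] * n_src < cum_src[v] * n_ref:
            w += 1
        lookup.append(w)
    return [lookup[p] for p in pixels]


# Toy example with 4 intensity levels instead of 256.
dark_image = [0, 0, 1, 1]
bright_reference = [2, 2, 3, 3]
matched = match_histogram(dark_image, bright_reference, levels=4)
```

In this toy case the dark image is remapped to [2, 2, 3, 3], so that its histogram coincides with that of the reference.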

In terms of implementation, in both the within-modality and cross-modality
versions of the algorithm the registration is performed on sub-sampled images,
in decreasing order of sub-sampling.
There are a number of things to keep in mind. AIR’s global approach implies
the transformation will be consistent throughout the entire image volume.
However, this does introduce the possibility of obtaining an unstable
transformation, especially near the image boundaries. Small and/or local
perturbations may also result in disproportionate changes in the global
transformation. The AIR algorithm is also computationally intensive. It is not
easy, after all, to minimize the standard deviation of the ratios. However, the
algorithm does perform well with noisy data [36].

3.5. Mutual information based techniques. Mutual Information [39] is an 
error metric (or similarity metric) used in image registration based on ideas from 
Information Theory. Mutual Information uses the pixel intensities themselves. 
The strategy is this: minimize the information content of the difference image,
i.e., the content of target − source.

Consider Figure 7. The particular example is a bit of a cheat, but it illustrates
the point. In the top row we have two axial images. They are the source
and target images. The image on the lower left is the difference between the
aligned source and target. Since the pixel intensities of the source and target
are nearly identical, the difference image is basically blank. Now suppose we
take the aligned source and translate it by one pixel. In the resulting difference
image, the boundary of the skull is quite obvious. Whereas in the first difference
image one has to “hunt” for features (and fail to find any), in the second we do
not. Features stand out. So, in a sense, the second difference image has more
information than the first: we see a shape. Mutual Information wants that
difference image to have as little information as possible.

Figure 7. The philosophy behind Mutual Information. The source is the top left
image, and the target is the top right. The difference image between the aligned
source and target (lower left) looks nearly completely blank. Some structure
might be vaguely visible, but not nearly as much as in the difference image
resulting from translating the aligned source by 1 pixel (lower right).

To go a little further, let us begin with the question: how well does one image 
explain, or “predict”, another? We use a joint probability distribution. Let 
p(a, b) denote the probability that a pixel value a in the source and b in the 
target occurs, for all a and b. We estimate the joint probability distribution by 
making a joint histogram of pixel values. When two images are in alignment, the
corresponding anatomical areas overlap, and hence the joint histogram is
concentrated in a few bins with high counts.
In misalignment, anatomical areas are mixed up, e.g., brain over skin, and this 
results in a somewhat more dispersed joint histogram. See Figures 8 and 9. 

What we want to do is make the “crispiest” joint probability distribution
possible. Let $I(A,B)$ denote the Mutual Information of two images $A$ and $B$.
This can be defined in terms of the entropies (i.e., “How dispersed is the joint
probability distribution?”) $H(A)$, $H(B)$ and $H(A,B)$:

\[ I(A,B) = H(A) + H(B) - H(A,B)
          = \sum_{x\in A,\; y\in B} p(x,y)\,\log_2\!\left(\frac{p(x,y)}{p(x)\,p(y)}\right). \]

Therefore, to maximize their mutual information $I(A,B)$, to get image $A$ to
tell us as much as possible about $B$, we need to minimize the entropy $H(A,B)$.
The reader is encouraged to read the seminal paper by Viola et al. [39] for fur- 
ther information regarding exactly how the entropy H(A,B) is minimized. In 
brief, [39] use a stochastic analog of the gradient descent technique to maximize 
I (A, B), after first approximating the derivatives of the mutual information error 
measure. In order to obtain these derivatives, the probability density functions 
are approximated by a sum of Gaussians using the Parzen-window method [16] 
(after this approximation, the derivatives can be obtained analytically). The 
geometric distortion model used is global affine. In general, the various im- 
plementations differ in the minimization technique. For example, Collignon et 
al. [14] use Powell’s method for the minimization. 
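
The quantity being optimized can be computed directly from a joint histogram. The sketch below evaluates I(A,B) for two flattened images using the sum over p(x,y) log2(p(x,y) / (p(x) p(y))); it only evaluates the measure, and does not attempt the Parzen-window estimate or the stochastic optimization of [39]. The toy images and function name are illustrative assumptions.

```python
import math
from collections import Counter


def mutual_information(image_a, image_b):
    """Mutual information (in bits) estimated from the joint histogram."""
    n = len(image_a)
    if n == 0 or n != len(image_b):
        raise ValueError("images must be non-empty and of equal size")
    joint = Counter(zip(image_a, image_b))   # joint histogram of pixel pairs
    hist_a = Counter(image_a)                # marginal histograms
    hist_b = Counter(image_b)
    total = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        total += p_ab * math.log2(p_ab / ((hist_a[a] / n) * (hist_b[b] / n)))
    return total


# A toy 1-D "image" with four intensity levels, compared with itself and with
# a copy circularly shifted by one pixel.
image = [0, 0, 1, 1, 2, 2, 3, 3]
shifted = image[-1:] + image[:-1]

mi_aligned = mutual_information(image, image)
mi_shifted = mutual_information(image, shifted)
```

For this example the aligned pair yields 2 bits (the full entropy of the image) and the shifted pair 1 bit, matching the picture of a crisp diagonal joint histogram that disperses under misalignment.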

In the final analysis, we find that Mutual Information is quite good in
multi-modal situations. However, it is computationally very expensive, as well
as being sensitive to how the interpolation is done, e.g., the minimum found may
not be the correct/optimal one.
Figure 8. Joint histograms of identical source and target images. No registration
is necessary to align them. The resulting joint histogram is a diagonal line.
Translating by 1 pixel significantly disperses the diagonal (lower left), and by 3
pixels, further still (lower right). (Panels: a functional image; perfect
alignment; one pixel off; three pixels off.)

3.6. Optic flow fields. This registration technique [30] borrows tools from dif- 
ferential flow estimation. The underlying philosophical principle of the algorithm 
is that we want to flow from the source to the target. Think of an air bubble 
that is rising to the surface of a lake. The bubble’s surface smoothly bends and 
flexes this way and that as it floats upward. The source and target images are 
two snapshots taken of the rising bubble. Starting from the two snapshots, the 
algorithm determines the deformations that occur when going from source to 
target. The source image is the bubble at t = 0, and the target image is the 
bubble at t = 1. What happened between 0 and 1 ? 

The highlights of this technique are: 

• The technique is based on differential flow estimation.

• Idea: we want to flow from the source image to the target (reference) image.

• The procedure is fully automated. 

• Uses an affine model. 

• Allows for intensity variations between the source and target images. 
Figure 9. Joint histograms of different source and target images. While not
strictly a diagonal line, the joint histogram of the aligned source and target
images is relatively narrow (lower left). Translating by one pixel significantly
disperses the diagonal (lower right). (Top panels: source; target.)

Full details and results of the algorithm may be found in [30]. Since the model 
is very straightforward, we will delve a little deeper into this algorithm than we 
have so far with the previous algorithms discussed. It can be considered as an 
example of how, beginning with basic principles, a registration technique is born. 

Our starting point is the general form of a 2-D affine transformation:

\[ \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}
   = \begin{pmatrix} m_1 & m_2 \\ m_3 & m_4 \end{pmatrix}
     \begin{pmatrix} x \\ y \end{pmatrix}
   + \begin{pmatrix} m_5 \\ m_6 \end{pmatrix} \]

where $x, y$ denote spatial coordinates in the source image and $x_1, y_1$ denote
spatial coordinates in the target. Depending on the values $m_1$, $m_2$, $m_3$ and
$m_4$, certain well known geometric transformations can result (see Figure 10).
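
As a small illustration, the affine map can be applied to a point with the parameters stored in the order (m1, ..., m6). The helper names are hypothetical; the rotation uses the sign convention m1 = cos θ, m2 = sin θ, m3 = −sin θ, m4 = cos θ.

```python
import math


def apply_affine(m, x, y):
    """Map source coordinates (x, y) to target coordinates (x1, y1)."""
    m1, m2, m3, m4, m5, m6 = m
    return (m1 * x + m2 * y + m5, m3 * x + m4 * y + m6)


def rotation(theta):
    """Pure rotation, no translation."""
    return (math.cos(theta), math.sin(theta), -math.sin(theta), math.cos(theta), 0.0, 0.0)


def scaling(sx, sy):
    """Independent scaling along each axis (m1 = sx, m4 = sy)."""
    return (sx, 0.0, 0.0, sy, 0.0, 0.0)


def shear(m2, m3):
    """Shear with off-diagonal entries m2 and m3."""
    return (1.0, m2, m3, 1.0, 0.0, 0.0)


rotated = apply_affine(rotation(math.pi / 2), 1.0, 0.0)
scaled = apply_affine(scaling(2.0, 3.0), 1.0, 1.0)
sheared = apply_affine(shear(0.5, 0.0), 0.0, 2.0)
translated = apply_affine((1.0, 0.0, 0.0, 1.0, 4.0, -2.0), 1.0, 1.0)
```

With this convention a quarter-turn sends (1, 0) to approximately (0, −1), the scaling sends (1, 1) to (2, 3), and the shear sends (0, 2) to (1, 2).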

Now recall our description at the beginning of this section, that of a bubble 
rising through the water. We took two snapshots, one at t = 0, and one at 
t = 1, of the same bubble. Hence it is reasonable to have a single function, with 
temporal variable t, represent the bubble at time t. 

With this in mind, let $f(x,y,t)$, $f(x,y,t-1)$ represent the source and target
images, respectively. To further simplify the model, at least for the moment, we
will make the “Brightness-Constancy” assumption: identical anatomical features
in both images will have the same pixel intensity. That is, we are not allowing
for the possibility that, say, the left eye in the MR source image is brighter or
darker than the left eye in the MR target image. Before tackling more difficult
issues later, we want to ensure that only an affine transformation, and nothing
else, is required to mold the source into the target.

Figure 10. A smattering of linear transformations. (Panels: original;
rotation, $\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$;
scaling, $\begin{pmatrix} m_1 & 0 \\ 0 & m_4 \end{pmatrix}$;
shear, $\begin{pmatrix} 1 & m_2 \\ m_3 & 1 \end{pmatrix}$.)

Using the notation we have just introduced (which we will slightly abuse now),
we have the situation:

\[ f(x,y,t) = f(m_1 x + m_2 y + m_5,\; m_3 x + m_4 y + m_6,\; t-1). \tag{3-1} \]
We use a least squares approach to estimate the parameters
$\vec{m} = (m_1 \ldots m_6)^T$ in (3-1). Now the function we really want to
minimize is:

\[ E(\vec{m}) = \sum_{x,y\in\Omega} \bigl( f(x,y,t)
   - f(m_1 x + m_2 y + m_5,\; m_3 x + m_4 y + m_6,\; t-1) \bigr)^2 \tag{3-2} \]

where $\Omega$ denotes the region of interest. However, the fact that $E(\vec{m})$
is not linear means that minimizing will be tricky. So we take an easy way out
and instead take its truncated, first-order Taylor series expansion. Letting

\[ k = f_t + x f_x + y f_y, \qquad
   \vec{c} = (x f_x \;\; y f_x \;\; x f_y \;\; y f_y \;\; f_x \;\; f_y)^T, \tag{3-3} \]

where the subscripts denote partial derivatives, we eventually arrive at this much
more reasonable error function:

\[ E(\vec{m}) = \sum_{x,y\in\Omega} \bigl(k - \vec{c}^{\,T}\vec{m}\bigr)^2. \tag{3-4} \]
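
The step hidden behind "we eventually arrive at" can be spelled out as follows. This is a sketch of the first-order expansion of the right-hand side of (3-1) about the point (x, y, t), with all derivatives evaluated there; it uses only the symbols already defined in the text.

```latex
\begin{align*}
f(m_1 x + m_2 y + m_5,\; m_3 x + m_4 y + m_6,\; t-1)
  &\approx f + (m_1 x + m_2 y + m_5 - x)\, f_x
             + (m_3 x + m_4 y + m_6 - y)\, f_y - f_t , \\
f(x,y,t) - f(m_1 x + m_2 y + m_5,\; m_3 x + m_4 y + m_6,\; t-1)
  &\approx (f_t + x f_x + y f_y) \\
  &\quad - (m_1\, x f_x + m_2\, y f_x + m_3\, x f_y + m_4\, y f_y + m_5 f_x + m_6 f_y) \\
  &= k - \vec{c}^{\,T}\vec{m} .
\end{align*}
```

Squaring this residual and summing over the region of interest gives the quadratic error function, which is why its minimizer can be written in closed form.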



To minimize (3-4), we differentiate with respect to $\vec{m}$:

\[ \frac{dE}{d\vec{m}} = \sum_{\Omega} -2\,\vec{c}\,\bigl(k - \vec{c}^{\,T}\vec{m}\bigr), \]

set equal to 0, and solve for the model parameters to obtain:

\[ \vec{m} = \Bigl[\sum_{\Omega} \vec{c}\,\vec{c}^{\,T}\Bigr]^{-1}
             \Bigl[\sum_{\Omega} \vec{c}\,k\Bigr]. \tag{3-5} \]

And lo! we have determined $\vec{m}$. However, there is a caveat. We are assuming
that the $6\times 6$ matrix $\bigl(\sum_{\Omega} \vec{c}\,\vec{c}^{\,T}\bigr)$ in (3-5)
is, in fact, invertible. We can usually guarantee this by making sure that the
spatial region $\Omega$ is large enough to have sufficient image content, e.g., we
would want some “interesting” features in $\Omega$ like edges, and not simply a
“bland” area. The parameters $\vec{m}$ are for the region $\Omega$.
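
A minimal sketch of the closed-form solve is shown below: accumulate the matrix Σ c cᵀ and the vector Σ c k over the region, then solve the resulting linear system, reporting failure when the matrix is (numerically) singular. The function names, the synthetic derivative values and the pivot tolerance are assumptions made for illustration; [30] should be consulted for the actual implementation.

```python
import random


def solve_linear_system(a, b, tolerance=1e-12):
    """Gaussian elimination with partial pivoting; raises if a is singular."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < tolerance:
            raise ValueError("singular matrix: region lacks image content")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[r][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - rest) / aug[i][i]
    return x


def estimate_parameters(c_vectors, k_values):
    """Solve (sum c c^T) m = (sum c k) for the parameter vector m."""
    n = len(c_vectors[0])
    a = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for c, k in zip(c_vectors, k_values):
        for i in range(n):
            b[i] += c[i] * k
            for j in range(n):
                a[i][j] += c[i] * c[j]
    return solve_linear_system(a, b)


# Synthetic check: random image gradients on a 5 x 5 neighborhood, with k
# generated from a known parameter vector so the estimate can be verified.
random.seed(0)
true_m = [1.02, 0.01, -0.03, 0.98, 0.5, -0.25]
c_vectors, k_values = [], []
for x in range(-2, 3):
    for y in range(-2, 3):
        fx, fy = random.uniform(-1, 1), random.uniform(-1, 1)
        c = [x * fx, y * fx, x * fy, y * fy, fx, fy]
        c_vectors.append(c)
        k_values.append(sum(ci * mi for ci, mi in zip(c, true_m)))

estimated_m = estimate_parameters(c_vectors, k_values)
```

A region with zero gradients everywhere makes every c vanish, so the accumulated matrix is singular and the solver raises an error, which is the "bland area" caveat in code form.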

In terms of actual implementation, the parameters $\vec{m}$ are estimated locally,
for different spatial neighborhoods. By applying this algorithm in a multi-scale 
fashion, it is possible to capture large motions. (See [30] for details.) This is 
illustrated in Figure 11, in the case where the target image is a synthetically 
warped version of the source image. 



Editorial. As an aside, we mention that doing an experiment such as this, reg- 
istering an image with a warped version of itself is not altogether silly. If an 
algorithm being developed fails in an ideal test case such as this, chances are 
very good that it will fail for genuinely different images. However, to make a 
“fair” ideal test, the method of warping the image should be independent of the 
registration method. For example, if the registration algorithm is to determine 
an affine transform, do not warp the image using an affine transform. Use some 
other method, e.g., apply Bookstein’s thin-plate splines [9]. 
Figure 11. Flowing from source to target: An “ideal” experiment.

The optic flow model can next be modified to account for differences of
contrast and brightness between the two images with the addition of two new
parameters, $m_7$ for contrast, and $m_8$ for brightness. The new version of
(3-1) is

\[ m_7 f(x,y,t) + m_8 = f(m_1 x + m_2 y + m_5,\; m_3 x + m_4 y + m_6,\; t-1). \tag{3-6} \]

We are also assuming that, in addition to the affine parameters, the brightness 
and contrast parameters are constant within small spatial neighborhoods. 

Minimizing the least squares error as before, using a first-order Taylor series
expansion, gives a solution identical in form to (3-5) except that this time

\[ k = f_t - f + x f_x + y f_y, \qquad
   \vec{c} = (x f_x \;\; y f_x \;\; x f_y \;\; y f_y \;\; f_x \;\; f_y \;\; -f \;\; -1)^T; \tag{3-7} \]

compare equations (3-3).
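
The origin of the two extra entries −f and −1 can be seen by repeating the first-order expansion with the contrast and brightness terms included. The lines below are a sketch using the same symbols as the text, with all derivatives evaluated at (x, y, t).

```latex
\begin{align*}
m_7 f + m_8 - f(m_1 x + m_2 y + m_5,\; m_3 x + m_4 y + m_6,\; t-1)
  &\approx m_7 f + m_8 - f - (m_1 x + m_2 y + m_5 - x)\, f_x \\
  &\quad - (m_3 x + m_4 y + m_6 - y)\, f_y + f_t \\
  &= (f_t - f + x f_x + y f_y) \\
  &\quad - (m_1\, x f_x + m_2\, y f_x + m_3\, x f_y + m_4\, y f_y
            + m_5 f_x + m_6 f_y - m_7 f - m_8) \\
  &= k - \vec{c}^{\,T}\vec{m}, \qquad \vec{m} = (m_1 \ldots m_8)^T .
\end{align*}
```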

Now, we have been working under the assumption that the affine and con- 
trast/brightness parameters are constant within some small spatial neighbor- 
hood. This introduces two conflicting conditions. 

Recall the matrix $\sum_{\Omega} \vec{c}\,\vec{c}^{\,T}$. This matrix needs to have an
inverse. As was mentioned
earlier, this can be arranged by considering a large enough region $\Omega$, i.e., a region
with sufficient image content. However, the larger the area, the less likely it is 
that the brightness constancy assumption holds. Think about it: image content 
can be edges, and edges can have very different intensities, when compared with 
surrounding tissue. 

Fortunately, the model can be modified one more time. Instead of a single
error function (3-4), we can consider the sum of two errors:

\[ E(\vec{m}) = E_b(\vec{m}) + E_s(\vec{m}) \tag{3-8} \]

where

\[ E_b(\vec{m}) = \bigl(k - \vec{c}^{\,T}\vec{m}\bigr)^2 \]

with $k$ and $\vec{c}$ defined as in (3-7), and

\[ E_s(\vec{m}) = \sum_{i=1}^{8} \lambda_i \left[
   \left(\frac{\partial m_i}{\partial x}\right)^2
   + \left(\frac{\partial m_i}{\partial y}\right)^2 \right], \]

where $\lambda_i$ is a positive constant, set by the user, that weights the
smoothness constraint imposed on $m_i$.

Figure 12. Registering an excessively distorted source image to a target image.
(Panels: source; target; registered result.)

As before, one works with Taylor series expansions of (3-8), but things become
a little more complicated. Complete details of how to work with (3-8), as well
as generalizations to 3-D, may be found in [30]. Some results are shown in
Figures 12-13.



4. Conclusion 

We have presented a whirlwind introduction to image registration for MRI. 
After providing a theoretical framework by which the problem is defined, we 
presented, in no particular order, a number of different algorithms. We then 
provided a more detailed discussion of an algorithm based on the idea of optic 
flow fields. 

Our intent in this paper was to illustrate how the problem of image registration 
can have a wide variety of very dissimilar solutions. And there exist many more 
techniques than those presented here. For example, image features that some 
of these methods depend upon include surfaces [28; 15; 17], edges [27; 21], and 
contours [26; 35]. There are also methods based on B-splines [37; 22; 33], thin- 
plate splines [9; 10], and low-frequency discrete cosine basis functions [3; 4]. 

There are many survey articles the reader may wish to read, to learn more 
about medical image registration. In addition to those cited earlier ([11; 38]), we
also call attention to [25; 24; 23; 40]. The simple existence of so many techniques 
provides more than sufficient support for the thesis that there are many paths 
to the One Truth: perfect image alignment. 
Figure 13. Registering two different clinical images. The lower left image shows
how the edges of the registered source compare with the target's edges. The
lower right image shows the registered source itself, after it has undergone both
geometric and intensity-correction transformations. (Panels: source; target;
registered edge difference; registered result.)
Image Compression:

The Mathematics of JPEG 2000 

JIN LI 



Abstract. We briefly review the mathematics in the coding engine of 
JPEG 2000, a state-of-the-art image compression system. We focus in 
depth on the transform, entropy coding and bitstream assembler modules. 
Our goal is to present a general overview of the mathematics underlying a 
state of the art scalable image compression technology. 



1. Introduction 

Data compression is a process that creates a compact data representation 
from a raw data source, usually with an end goal of facilitating storage or trans- 
mission. Broadly speaking, compression takes two forms, either lossless or lossy, 
depending on whether or not it is possible to reconstruct exactly the original 
datastream from its compressed version. For example, a data stream that
consists of long runs of 0s and 1s (such as that generated by a black and white
fax) would possibly benefit from simple run-length encoding, a lossless technique
replacing the original datastream by a sequence of counts of the lengths of the
alternating substrings of 0s and 1s. Lossless compression is necessary for
situations in which changing a single bit can have catastrophic effects, such as in
machine code of a computer program.
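
Run-length encoding of a binary stream is short enough to sketch in full. The representation chosen here (the first symbol plus a list of run lengths) and the function names are assumptions made for illustration; real fax codes additionally entropy-code the run lengths.

```python
from itertools import groupby


def run_length_encode(bits):
    """Encode a string of '0'/'1' as (first symbol, list of run lengths)."""
    runs = [len(list(group)) for _, group in groupby(bits)]
    first = bits[0] if bits else None
    return first, runs


def run_length_decode(first, runs):
    """Invert run_length_encode by alternating symbols between runs."""
    pieces, symbol = [], first
    for length in runs:
        pieces.append(symbol * length)
        symbol = "1" if symbol == "0" else "0"
    return "".join(pieces)


stream = "0000011100000000"
first_symbol, run_lengths = run_length_encode(stream)
restored = run_length_decode(first_symbol, run_lengths)
```

The sixteen-symbol stream is described by the first symbol and the three counts 5, 3, 8, and decoding restores it exactly, which is what makes the technique lossless.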

While it might seem as though we should always demand lossless compres- 
sion, there are in fact many venues where exact reproduction is unnecessary. In 
particular, media compression, which we define to be the compression of im- 
age, audio, or video files, presents an excellent opportunity for lossy techniques. 
For example, not one among us would be able to distinguish between two images
which differ in only one of the roughly $2^{25}$ bits in a typical 1024 × 1024
color image (at 24 bits per pixel). Thus distortion is tolerable in media
compression, and it is the content, rather than the exact bits, that is of
paramount importance. Moreover, the size of the original media is usually very
large, so that it is essential to achieve a considerably high compression ratio
(defined to be the ratio of the size of the original data file to the size of
its compressed version). This is achieved by taking advantage of psychophysics
(say by ignoring less perceptible details of the media) and by the use of
entropy coding, the exploitation of various information redundancies that may
exist in the source data.

Keywords: Image compression, JPEG 2000, transform, wavelet, entropy coder,
subbitplane entropy coder, bitstream assembler.

Conventional media compression solutions focus on a static or one-time form 
of compression — i.e. , the compressed bitstream provides a static representation 
of the source data that makes possible a unique reconstruction of the source, 
whose characteristics are quantified by a compression ratio determined at the 
time of encoding. Implicit in this approach is the notion of a “one shoe fits all”
technique, an outcome that would appear to be at variance with the multiplicity
of reconstruction platforms upon which the media will ultimately reside. Often,
different applications have different requirements for the compression ratio
and tolerate different levels of compression distortion. A publishing
application may require a compression scheme with very little distortion, while a
web application may tolerate relatively large distortion in exchange for smaller 
compressed media. 

Recently scalable compression has emerged as a category of media compres- 
sion algorithms capable of trading between compression ratio and distortion after 
generating an initially compressed master bitstream. Subsets of the master then 
may be extracted to form particular application bitstreams which may exhibit 
a variety of compression ratios. (I.e., working from the master bitstream we 
can achieve a range of compressions, with the concomitant ability to reconstruct 
coarse to fine scale characteristics.) With scalable compression, compressed me- 
dia can be tailored effortlessly for applications with vastly different compression 
ratio and quality requirements, a property which is particularly valuable in media 
storage and transmission. 

In what follows, we restrict our attention to image compression, in particular, 
focusing on the JPEG 2000 image compression standard, and thereby illustrate 
the mathematical underpinnings of a modern scalable media compression
algorithm. The paper is organized as follows. The basic concepts of scalable
image compression and its applications are discussed in Section 2. JPEG 2000
and its development history are briefly reviewed in Section 3. The transform,
quantization, entropy coding, and bitstream assembler modules are examined
in detail in Sections 4 to 7. Readers interested in further details may refer to
[1; 2; 3].



2. Image Compression 



Digital images are used every day. A digital image is essentially a 2D data
array x(i,j), where i and j index the row and column of the data array, and
x(i,j) is referred to as a pixel. Gray-scale images assign to each pixel a single
scalar intensity value G, whereas color images traditionally assign to each pixel
a color vector (R,G,B), whose entries represent the intensity of the red, green,
and blue components, respectively. Because it is the content of the digital image
that matters, the underlying 2D data array may undergo big changes while still
conveying the content to the user with little or no perceptible distortion. An
example is shown in Figure 1. On the left the classic image processing test case
Lena is shown as a 512 × 512 gray-scale image. To the right of the original
are several applications, each showing different sorts of compression. The first
application illustrates the use of subsampling in order to fit a smaller image (in
this case 256 × 256). The second application uses JPEG (the predecessor to JPEG
2000) to compress the image to a bitstream, and then decodes the bitstream back
to an image of size 512 × 512. Although in each case the underlying 2D data array
is changed tremendously, the primary content of the image remains intelligible.





Figure 1. Source digital image and compressions (panel label: Compress (JPEG)).



Each of the applications above results in a reduction in the amount of source 
image data. In this paper, we focus our attention on JPEG 2000, which is a 
next generation image compression standard. JPEG 2000 distinguishes itself 
from older generations of compression standards not only by virtue of its higher 
compression ratios, but also by its many new functionalities. The most noticeable 
among them is its scalability. From a compressed JPEG 2000 bitstream, it is 
possible to extract a subset of the bitstream that decodes to an image of variable 
quality and resolution (inversely correlated with its accompanying compression 
ratio), and/or variable spatial locality. 
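
The idea of extracting a lower-quality image from a prefix of a master bitstream can be illustrated with a toy bit-plane coder: the most significant bits of every pixel are sent first, so truncating the stream after a few planes still decodes to a coarse image. This is only a sketch of the embedding principle with hypothetical function names; it is not the coding scheme used by JPEG 2000.

```python
def encode_bit_planes(pixels, bit_depth=8):
    """Split pixel values into bit planes, most significant plane first."""
    return [[(p >> b) & 1 for p in pixels] for b in range(bit_depth - 1, -1, -1)]


def decode_bit_planes(planes, bit_depth=8):
    """Reconstruct pixels from the leading planes of the stream.

    Bits that were not received are replaced by the midpoint of their range.
    """
    count = len(planes[0]) if planes else 0
    values = [0] * count
    for index, plane in enumerate(planes):
        shift = bit_depth - 1 - index
        for i, bit in enumerate(plane):
            values[i] |= bit << shift
    missing = bit_depth - len(planes)
    if missing > 0:
        half = 1 << (missing - 1)
        values = [v + half for v in values]
    return values


pixels = [200, 13, 97, 255]
planes = encode_bit_planes(pixels)      # the "master bitstream"

coarse = decode_bit_planes(planes[:2])  # first two planes only
finer = decode_bit_planes(planes[:4])   # first four planes
exact = decode_bit_planes(planes)       # all eight planes


def max_error(decoded):
    """Largest absolute pixel error of a reconstruction."""
    return max(abs(a - b) for a, b in zip(pixels, decoded))
```

Decoding two planes gives [224, 32, 96, 224], four planes give [200, 8, 104, 248], and all eight planes reproduce the pixels exactly, so the maximum error shrinks from 31 to 7 to 0 as more of the stream is kept.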

Scalable image compression has important applications in image storage and 
delivery. Consider the application of digital photography. Presently, digital
cameras all use non-scalable image compression technologies, mainly JPEG. A
camera with a fixed amount of memory can accommodate a small number
of high quality, high-resolution images, or a large number of low quality, low- 
resolution images. Unfortunately, the image quality and resolution must be 
determined before shooting the photos. This leads to the often painful trade-off 
between removing old photos to make space for new exciting shots, and shooting 
new photos of poorer quality and resolution. Scalable image compression makes 
possible the adjustment of image quality and resolution after the photo is shot, 
so that instead, the original digital photos always can be shot at the highest 
possible quality and resolution, and when the camera memory is filled to capacity, 
the compressed bitstream of existing shots may be truncated to smaller size 
to leave room for the upcoming shots. This need not be accomplished in a 
uniform fashion, with some photos kept with reduced resolution and quality, 
while others retain high resolution and quality. By dynamically trading between 
the number of images and the image quality, the use of precious camera memory 
is apportioned wisely. 

Web browsing provides another important application of scalable image com- 
pression. As the resolution of digital cameras and digital scanners continues to 
increase, high-resolution digital imagery becomes a reality. While it is a plea- 
sure to view a high-resolution image, for much of our web viewing we’d trade the 
resolution for speed of delivery. In the absence of scalable image compression 
technology it is common practice to generate multiple copies of the compressed 
bitstream, varying the spatial region, resolution and compression ratio, and put 
all copies on a web server in order to accommodate a variety of network situa- 
tions. The multiple copies of a fixed media source file can cause data management 
headaches and waste valuable server space. Scalable compression techniques al- 
low a single scalable master bitstream of the compressed image on the server 
to serve all purposes. During image browsing, the user may specify a region 
of interest (ROI) with a certain spatial and resolution constraint. The browser 
then only downloads a subset of the compressed media bitstream covering the 
current ROI, and the download can be performed in a progressive fashion so that 
a coarse view of the ROI can be rendered very quickly and then gradually refined 
as more and more bits arrive. Therefore, with scalable image compression, it is 
possible to browse large images quickly and on demand (see e.g., the Vmedia 
project [25]). 



3. JPEG 2000 

3.1. History. JPEG 2000 is the successor to JPEG. The acronym JPEG stands 
for Joint Photographic Experts Group. This is a group of image processing ex- 
perts, nominated by national standard bodies and major companies to work to 
produce standards for continuous tone image coding. The official title of the 
committee is “ISO/IEC JTC1/SC29 Working Group 1”, which often appears in 
the reference document. The JPEG members selected a DCT-based image
compression algorithm in 1988, and while the original JPEG was quite successful,
it became clear in the early 1990s that new wavelet-based image compression 
schemes such as CREW (compression with reversible embedded wavelets) [5] 
and EZW (embedded zerotree wavelets) [6] were surpassing JPEG in both per- 
formance and available features, such as scalability. It was time to begin to 
rethink the industry standard in order to incorporate these new mathematical 
advances. 

Based on industrial demand, the JPEG 2000 research and development effort 
was initiated in 1996. A call for technical contributions was issued in March 
1997 [17]. The first evaluation was performed in November 1997 in Sydney, 
Australia, where twenty-four algorithms were submitted and evaluated. Follow- 
ing the evaluation, it was decided to create a JPEG 2000 “verification model” 
(VM) which was a reference implementation (in document and in software) of 
the working standard. The first VM (VM0) was based on the wavelet/trellis coded
quantization (WTCQ) algorithm submitted by SAIC and the University of Ari- 
zona (SAIC/UA) [18]. At the November 1998 meeting, the algorithm EBCOT 
(embedded block coding with optimized truncation) was adopted into VM3, and 
the entire VM software was re-implemented in an object-oriented manner. The 
document describing the basic JPEG 2000 decoder (part I) reached committee 
draft (CD) status in December 1999. JPEG 2000 finally became an international 
standard (IS) in December 2000. 

3.2. JPEG. In order to understand JPEG 2000, it is instructive to revisit the 
original JPEG. As illustrated by Figure 2, JPEG is composed of a sequence of 
four main modules. 




[Figure 2 diagram: COMP & PART -> DCT -> QUAN -> RUN-LEVEL CODING -> FINAL BITSTR]



Figure 2. Operation flow of JPEG. 

The first module (COMP & PART) performs component and tile separation, 
whose function is to cut the image into manageable chunks for processing. Tile 
separation is simply the separation of the image into spatially non-overlapping 
tiles of equal size. Component separation makes possible the decorrelation of 
color components. For example, a color image, in which each pixel is nor- 
mally represented with three numbers indicating the levels of red, green and 
blue (RGB) may be transformed to LCrCb (luminance, chrominance red and 
chrominance blue) space. 

After separation, each tile of each component is processed separately
by a discrete cosine transform (DCT), which is closely related to the
Fourier transform (see [30], for example). The coefficients are then quantized.
Quantization takes the DCT coefficients (typically floating point numbers)
and turns them into integers. For example, simple rounding is a
form of quantization. In the case of JPEG, we apply rounding plus a mask
which applies a system of weights reflecting various psychovisual observations
regarding human processing of images [31]. Finally, the coefficients are subjected
to a form of run-level encoding, where the basic symbol is a run-length of zeros
followed by a non-zero level; the combined symbol is then Huffman encoded.
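
The run-level idea can be illustrated with a few lines of Python. This is only a sketch of the symbol formation described above, not the JPEG bitstream syntax: the function name `run_level_symbols` and the `"EOB"` marker for trailing zeros are assumptions made for illustration, and the Huffman coding of the symbols is omitted.

```python
def run_level_symbols(coeffs):
    """Turn quantized coefficients into (run, level) symbols.

    Each symbol records how many zeros precede a non-zero level.
    Trailing zeros are summarized by a single "EOB" marker, an
    assumption of this sketch modeled on an end-of-block symbol.
    """
    symbols = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1          # extend the current run of zeros
        else:
            symbols.append((run, c))
            run = 0           # a non-zero level closes the run
    symbols.append("EOB")
    return symbols


quantized = [12, 0, 0, -3, 1, 0, 0, 0, 2, 0, 0, 0]
print(run_level_symbols(quantized))
```

Long runs of zeros, which are common after quantization, collapse into single symbols, which is what makes the subsequent entropy coding effective.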



3.3. Overview of JPEG 2000. Like JPEG, JPEG 2000 standardizes the 
decoder and the bitstream syntax. The operation flow of a typical JPEG 2000 
encoder is shown in Figure 3. 







[Figure 3 diagram; surviving label: COLOR IMAGE]





Figure 3. Flowchart for JPEG 2000. 



We again start with a component and tile separation module. After this 
preprocessing, we now apply a wavelet transform which yields a sequence of 
wavelet coefficients. This is a key difference between JPEG and JPEG 2000 
and we explain it in some detail in Section 4. We next quantize the wavelet
coefficients, which are then regrouped to facilitate localized spatial and
resolution access. By “resolution” we mean, in effect, the “degree” of the
wavelet coefficient: the wavelet decomposition is an expansion of the original
data vector in a basis which accounts for finer and finer detail, or increasing
resolution. The degrees of resolution are organized into subbands, which are
divided into non-overlapping rectangular blocks. Three spatially co-located
rectangles (one from each subband at a given resolution level) form a packet
partition. Each packet partition is further divided into code-blocks, each of
which is compressed by a subbitplane coder into an embedded bitstream with a
rate-distortion curve that records the distortion and rate at the end of each
subbitplane. The embedded bitstreams of the code-blocks are assembled into
packets, each of which represents an increment in quality corresponding to one
level of resolution at one spatial location. Collecting packets from all packet
partitions of all resolution levels of all tiles and all components, we form a
layer that gives one increment in quality of the entire image at full
resolution. The final JPEG 2000 bitstream may consist of multiple layers.

We summarize the main differences: 

(1) Transform module: wavelet versus DCT. JPEG uses an 8 x 8 discrete cosine
transform (DCT), while JPEG 2000 uses a wavelet transform with a lifting
implementation (see Section 4.1). The wavelet transform provides not only
better energy compaction (thus higher coding gain), but also resolution
scalability. Because the wavelet coefficients can be separated into different
resolutions, it is feasible to extract a lower resolution image by using only the
necessary wavelet coefficients.

(2) Block partition: spatial domain versus wavelet domain. JPEG partitions
the image into 16 x 16 macroblocks in the space domain, and then applies
the transform, quantization and entropy coding operations on each block
separately. Since blocks are independently encoded, annoying blocking artifacts
become noticeable whenever the coding rate is low. In contrast, JPEG
2000 performs the partition operation in the wavelet domain. Coupled with
the wavelet transform, this means there is no blocking artifact in JPEG 2000.

(3) Entropy coding module: run-level coefficient coding versus bitplane coding.
JPEG encodes the DCT transform coefficients one by one. The resultant block
bitstream cannot be truncated. JPEG 2000 encodes the wavelet coefficients
bitplane by bitplane (i.e., sending all zeroth order bits, then first order, etc.;
details are in Section 4.3). The generated bitstream can be truncated at any
point with graceful quality degradation. It is the bitplane entropy coder in
JPEG 2000 that enables the bitstream scalability.

(4) Rate control: quantization module versus bitstream assembly module. In
JPEG, the compression ratio and the amount of distortion are determined by
the quantization module. In JPEG 2000, the quantization module simply
converts the float coefficients of the wavelet transform module into integer
coefficients for further entropy coding. The compression ratio and distortion
are determined by the bitstream assembly module. Thus, JPEG 2000 can
manipulate the compressed bitstream, e.g., convert a compressed bitstream
to a bitstream of higher compression ratio, form a new bitstream of lower
resolution, or form a new bitstream of a different spatial area, by operating only
on the compressed bitstream and without going through the entropy coding
and transform modules. As a result, a JPEG 2000 compressed bitstream can be
reshaped (transcoded) very efficiently.







4. The Wavelet Transform 

4.1. Introduction. Most existing high-performance image coders in practical use
are transform-based coders. In a transform coder, the image pixels are
converted from the spatial domain to the transform domain through a linear
orthogonal or bi-orthogonal transform. A good choice of transform accomplishes
a decorrelation of the pixels, while simultaneously providing a representation in
which most of the energy is usually restricted to a few (relatively large)
coefficients. This is the key to achieving efficient coding (i.e., a high compression
ratio). Indeed, since most of the energy rests in a few large transform
coefficients, we may adopt entropy coding schemes, e.g., run-level coding or bitplane
coding schemes, that easily locate those coefficients and encode them. Because
the transform coefficients are highly decorrelated, the subsequent quantizer and
entropy coder can ignore the correlation among the transform coefficients, and
model them as independent random variables.

The optimal transform (in terms of decorrelation) of an image block can be 
derived through the Karhunen-Loeve (K-L) decomposition. Here we model the 
pixels as a set of statistically dependent random variables, and the K-L basis is 
that which achieves a diagonalization of the (empirically determined) covariance 
matrix. This is equivalent to computing the SVD (singular value decomposition) 
of the covariance matrix (see [28] for a thorough description). However, the K-L
transform lacks an efficient algorithm, and the transform basis is content
dependent (by contrast, the Fourier transform, which uses the sampled exponentials,
is not data dependent).

Popular transforms adopted in image coding include block-based transforms,
such as the DCT, and wavelet transforms. The DCT (used in JPEG) has many
well-known efficient implementations [26], and achieves good energy compaction
as well as coefficient decorrelation. However, the DCT is calculated
independently in spatially disjoint pixel blocks. Therefore, coding errors (i.e., lossy
compression) can cause discontinuities between blocks, which in turn lead to
annoying blocking artifacts. In contrast, the wavelet transform operates on the
entire image (or a tile of a component in the case of a large color image), which
gives both better energy compaction than the DCT and no post-coding blocking
artifacts. Moreover, the wavelet transform decomposes the image into an L-level
dyadic wavelet pyramid. The output of an example 5-level dyadic wavelet
pyramid is shown in Figure 4.

There is an obvious recursive structure generated by the following algorithm: 
lowpass and highpass filters (explained below, but for the moment, assume that 
these are convolution operators) are applied independently to both the rows and 
columns of the image. The output of these filters is then organized into four
new 2D arrays of one half the size (in each dimension), yielding an LL (lowpass,
lowpass) block, an LH (lowpass, highpass) block, an HL block and an HH block. The algorithm
is then applied recursively to the LL block, which is essentially a lower resolution 
[Figure 4 panels: ORIGINAL (128, 129, 125, 64, 65, ...); TRANSFORM COEFFICIENTS (4123, -12.4, -96.7, 4.5, ...)]



Figure 4. A 5-level dyadic wavelet pyramid. 

or smoothed version of the original. This output is organized as in Figure 4, with 
the southwest, southeast, and northeast quadrants of the various levels housing 
the LH, HH, and HL blocks respectively. We examine their structure as well as 
the algorithm in Sections 4.2 and 4.3. By not using the wavelet coefficients at
the finest $M$ levels, we can reconstruct an image that is $2^M$ times smaller in both
the horizontal and vertical directions than the original one. The multiresolution
nature (see [27], for example) of the wavelet transform is ideal for resolution 
scalability. 
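
As a small illustration of this resolution scalability, the following Python sketch computes the size of the LL block after each level of a dyadic pyramid, and hence the size of the image recovered when the finest levels are skipped. The function names are hypothetical, and rounding odd dimensions up is an assumption of the sketch rather than something stated in the text.

```python
def ll_sizes(width, height, levels):
    """Sizes of the LL block after each of `levels` dyadic decompositions."""
    sizes = []
    w, h = width, height
    for _ in range(levels):
        # Each level halves both dimensions (rounding up for odd sizes,
        # which is an assumption of this sketch).
        w, h = (w + 1) // 2, (h + 1) // 2
        sizes.append((w, h))
    return sizes


def reduced_size(width, height, skipped_levels):
    """Image size obtained when the finest `skipped_levels` levels are unused."""
    if skipped_levels == 0:
        return (width, height)
    return ll_sizes(width, height, skipped_levels)[-1]


print(ll_sizes(512, 512, 5))
print(reduced_size(512, 512, 2))
```

For a 512 x 512 image, skipping the two finest levels leaves a 128 x 128 image, i.e., one that is $2^2$ times smaller in each direction.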

4.2. Wavelet transform by lifting. Wavelets yield a signal representation in
which the low order (or lowpass) coefficients represent the most slowly changing
data while the high order (highpass) coefficients represent more localized changes.
This representation provides an elegant framework in which both short term
anomalies and long term trends can be analyzed on an equal footing. For the
theory of wavelets and multiresolution analysis, we refer the reader to [7; 8; 9].

We develop the framework of a one-dimensional wavelet transform using the
$z$-transform formalism. In this setting a given (bi-infinite) discrete signal $x[n]$ is
represented by the Laurent series $X(z)$ in which $x[n]$ is the coefficient of $z^{-n}$. The
$z$-transform of an FIR filter (finite impulse response, meaning a Laurent series with
a finite number of nonzero coefficients, and thus a Laurent polynomial) $H(z)$ is
represented by a Laurent polynomial

$$H(z) = \sum_{k=p}^{q} h(k) z^{-k} \quad \text{of degree } |H| = q - p.$$

Thus the length of a filter is the degree of its associated polynomial plus one. The 
sum or difference of two Laurent polynomials is again a Laurent polynomial and 
the product of two Laurent polynomials of degree a and b is a Laurent polynomial 
of degree a + b. Exact division is in general not possible, but division with
remainder is possible. This means that for any two nonzero Laurent polynomials
$a(z)$ and $b(z)$, with $|a(z)| \ge |b(z)|$, there will always exist a Laurent polynomial
$q(z)$ with $|q(z)| = |a(z)| - |b(z)|$ and a Laurent polynomial $r(z)$ with
$|r(z)| < |b(z)|$ such that

$$a(z) = b(z)q(z) + r(z).$$

This division is not necessarily unique. A Laurent polynomial is invertible if and
only if it is of degree zero, i.e., if it is of the form $cz^p$.
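
The degree conventions above are easy to check numerically. The following Python sketch stores a Laurent polynomial as a dictionary mapping the index $k$ to the coefficient of $z^{-k}$; this representation and the function names are assumptions chosen for illustration.

```python
def degree(poly):
    """Degree |H| = q - p: the span between the extreme nonzero exponents."""
    ks = [k for k, c in poly.items() if c != 0]
    return max(ks) - min(ks)


def multiply(a, b):
    """Product of two Laurent polynomials (exponents add, coefficients multiply)."""
    out = {}
    for i, ca in a.items():
        for j, cb in b.items():
            out[i + j] = out.get(i + j, 0) + ca * cb
    return {k: c for k, c in out.items() if c != 0}


a = {-1: 1, 0: 2, 1: 1}   # z + 2 + z^{-1}, degree 2
b = {0: 1, 1: -1}         # 1 - z^{-1},     degree 1
product = multiply(a, b)
print(product, degree(product))

monomial = {3: 5}         # 5 z^{-3}: degree zero, hence invertible
print(degree(monomial))
```

The product has degree 2 + 1 = 3, and the monomial has degree zero, which is exactly the invertible case described in the text.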

The original signal $X(z)$ goes through a low- and high-pass analysis FIR filter
pair $G(z)$ and $H(z)$. These are simply the independent convolutions of the original
data sequence against a pair of masks, and constitute perhaps the most basic
example of a filterbank [27]. The resulting pair of outputs are subsampled by a
factor of two. To reconstruct the original signal, the low- and high-pass coefficients
$\gamma(z)$ and $\lambda(z)$ are upsampled by a factor of two and passed through another
pair of synthesis FIR filters $G'(z)$ and $H'(z)$. Although IIR (infinite impulse
response) filters can also be used, the infinite response leads to an infinite data
expansion, an undesirable outcome in our finite world. According to filterbank
theory, if the filters satisfy the relations

$$G'(z)G(z^{-1}) + H'(z)H(z^{-1}) = 2,$$
$$G'(z)G(-z^{-1}) + H'(z)H(-z^{-1}) = 0,$$

the aliasing caused by the subsampling will be cancelled, and the reconstructed
signal $Y(z)$ will be equal to the original. Figure 5 provides an illustration.




Figure 5. Convolution implementation of one dimensional wavelet transform. 

A wavelet transform implemented in the fashion of Figure 5 with FIR filters is
said to have a convolutional implementation, reflecting the fact that the signal is
convolved with the pair of filters (h, g) that form the filter bank. Note that only
half the samples are kept by the subsampling operator, and the other half of the
filtered samples are thrown away. Clearly this is not efficient, and it would be
better (saving about half of the computation) to do the subsampling before the
filtering. This leads to an alternative implementation of the wavelet transform
called the lifting approach. It turns out that all FIR wavelet filters can be
factored into lifting steps. We explain the basic idea in what follows. For those
interested in a deeper understanding, we refer to [10; 11; 12].

The subsampling that is performed in the forward wavelet transform, and the
upsampling that is used in the inverse wavelet transform, suggest the utility of a
decomposition of the $z$-transform of the signal/filter into an even and odd part given
by subsampling the $z$-transform at the even and odd indices, respectively:

$$H(z) = \sum_n h(n) z^{-n},$$
$$H_e(z) = \sum_n h(2n) z^{-n} \quad \text{(even part)},$$
$$H_o(z) = \sum_n h(2n+1) z^{-n} \quad \text{(odd part)}.$$

The odd/even decomposition can be rewritten as

$$H(z) = H_e(z^2) + z^{-1} H_o(z^2)$$

with

$$H_e(z) = \tfrac{1}{2}\left(H(z^{1/2}) + H(-z^{1/2})\right),$$
$$H_o(z) = \tfrac{1}{2} z^{1/2}\left(H(z^{1/2}) - H(-z^{1/2})\right).$$

With this we may rewrite the wavelet filtering and subsampling operation (i.e.,
the lowpass and highpass components, $\gamma(z)$ and $\lambda(z)$, respectively) using the
even/odd parts of the signal and filter as

$$\gamma(z) = G_e(z)X_e(z) + z^{-1} G_o(z)X_o(z),$$
$$\lambda(z) = H_e(z)X_e(z) + z^{-1} H_o(z)X_o(z),$$

which can be written in matrix form as

$$\begin{pmatrix} \gamma(z) \\ \lambda(z) \end{pmatrix} = P(z) \begin{pmatrix} X_e(z) \\ z^{-1} X_o(z) \end{pmatrix},$$

where $P(z)$ is the polyphase matrix

$$P(z) = \begin{pmatrix} G_e(z) & G_o(z) \\ H_e(z) & H_o(z) \end{pmatrix}.$$



[Figure 6 labels: LOW PASS COEFF $\gamma(z)$; HIGH PASS COEFF $\lambda(z)$]




Figure 6. Single stage wavelet filter using polyphase matrices. 



The forward wavelet transform now becomes the left part of Figure 6. Note
that with the polyphase matrix, we perform the subsampling (split) operation before
the signal is filtered, which is more efficient than the description illustrated by
Figure 5, in which the subsampling is performed after the signal is filtered. We
move on to the inverse wavelet transform. It is not difficult to see that the
odd/even subsampling of the reconstructed signal can be obtained through

$$\begin{pmatrix} Y_e(z) \\ z Y_o(z) \end{pmatrix} = P'(z) \begin{pmatrix} \gamma(z) \\ \lambda(z) \end{pmatrix},$$

where $P'(z)$ is a dual polyphase matrix

$$P'(z) = \begin{pmatrix} G'_e(z) & H'_e(z) \\ G'_o(z) & H'_o(z) \end{pmatrix}.$$

The wavelet transform is invertible if the two polyphase matrices are inverse
to each other:

$$P'(z) = P(z)^{-1} = \frac{1}{H_o(z)G_e(z) - H_e(z)G_o(z)} \begin{pmatrix} H_o(z) & -G_o(z) \\ -H_e(z) & G_e(z) \end{pmatrix}.$$

If we constrain the determinant of the polyphase matrix to be one, i.e.,
$H_o(z)G_e(z) - H_e(z)G_o(z) = 1$, then not only are the polyphase matrices
invertible, but the inverse filter has a simple relationship to the forward filter:

$$G'_e(z) = H_o(z), \quad H'_e(z) = -G_o(z),$$
$$G'_o(z) = -H_e(z), \quad H'_o(z) = G_e(z),$$

which implies that the inverse filter is related to the forward filter by the
equations

$$G'(z) = z^{-1} H(-z^{-1}), \quad H'(z) = -z^{-1} G(-z^{-1}).$$

The corresponding pair of filters (g, h) is said to be complementary. Figure 6
illustrates the forward and inverse transforms using the polyphase matrices.

With the Laurent polynomial and polyphase matrix, we can factor a wavelet
filter into lifting steps. Starting with a complementary filter pair (g, h),
assume that the degree of filter g is larger than that of filter h. We seek a new
filter $g^{new}$ satisfying

$$g(z) = h(z)t(z^2) + g^{new}(z),$$

where $t(z)$ is a Laurent polynomial. Both $t(z)$ and $g^{new}(z)$ can be calculated
through long division [10]. The new filter $g^{new}$ is complementary to filter h, as
the polyphase matrix satisfies

$$P(z) = \begin{pmatrix} H_e(z)t(z) + G_e^{new}(z) & H_o(z)t(z) + G_o^{new}(z) \\ H_e(z) & H_o(z) \end{pmatrix}
= \begin{pmatrix} 1 & t(z) \\ 0 & 1 \end{pmatrix} \begin{pmatrix} G_e^{new}(z) & G_o^{new}(z) \\ H_e(z) & H_o(z) \end{pmatrix}
= \begin{pmatrix} 1 & t(z) \\ 0 & 1 \end{pmatrix} P^{new}(z).$$



Obviously, the determinant of the new polyphase matrix P new (z) also equals 
one. By performing the operation iteratively, it is possible to factor the polyphase 
matrix into a sequence of lifting steps: 



P(z) = 






The resultant lifting wavelet is shown in Figure 7.
[Figure 7 labels: LOW PASS COEFF $\gamma(z)$; HIGH PASS COEFF $\lambda(z)$]



Figure 7. Multi-stage forward lifting wavelet using polyphase matrices. 



Each lifting stage above can be directly inverted. Thus we can invert the 
entire wavelet: 



$$P'(z) = P(z)^{-1}$$






We show the inverse lifting wavelet using polyphase matrices in Figure 8, 
which should be compared with Figure 7. Only the direction of the data flow 
has changed. 
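
The direct invertibility of a single lifting stage can be made explicit with a short worked check. The upper-triangular form is the one derived above; the lower-triangular form with $s(z)$ is an assumption here, written by analogy with the lifting filters $s_i(z)$ and $t_i(z)$ of Section 4.3. Each elementary matrix has determinant one and is inverted by negating its off-diagonal entry:

```latex
\begin{pmatrix} 1 & t(z) \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & -t(z) \\ 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
\qquad
\begin{pmatrix} 1 & 0 \\ s(z) & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ -s(z) & 1 \end{pmatrix}
=
\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
```

No division by a Laurent polynomial is needed, and the inverse of a product of such stages is the product of the individual inverses taken in reverse order, which is why only the direction of the data flow changes.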




[Figure 8 labels: LOW PASS COEFF $\gamma(z)$; HIGH PASS COEFF $\lambda(z)$]



Figure 8. Multi-stage inverse lifting wavelet using polyphase matrices. 



4.3. Bi-orthogonal 9-7 wavelet and boundary extension. The default 
wavelet filter used in JPEG 2000 is the bi-orthogonal 9-7 wavelet [20]. It is 
a 4-stage lifting wavelet, with lifting filters $s_1(z) = f(a,z)$, $t_1(z) = f(b,z)$,
$s_2(z) = f(c,z)$, $t_2(z) = f(d,z)$, where $f$, the dual lifting step, is of the form

$$f(p,z) = pz^{-1} + p.$$

The quantities a, b, c and d are the lifting parameters at each stage. 

The next several figures illustrate the filterbank. The input data is indexed
as $\ldots, x_0, x_1, \ldots, x_n, \ldots$, and the lifting operation is performed from right to
left, stage by stage. At this moment, we assume that the data is of infinite
length, and we will discuss boundary extension later. The input data are first
partitioned into two groups corresponding to even and odd indices. During each
lifting stage, only one of the groups is updated. In the first lifting stage, the odd
index data points $x_1, x_3, \ldots$ are updated:

$$x'_{2n+1} = x_{2n+1} + a \cdot (x_{2n} + x_{2n+2}),$$

where $a$ and $x'_{2n+1}$ are respectively the first stage lifting parameter and outcome.
The entire operation corresponds to the filter $s_1(z)$ represented in Figure 8. The
circle in Figure 9 illustrates one such operation performed on $x_1$.




[Figure 9 legend: Original, High, Low]




Figure 9. Bi-orthogonal 9-7 wavelet. 

The second stage lifting, which corresponds to the filter $t_1(z)$ in Figure 8,
updates the data at even indices:

$$x''_{2n} = x_{2n} + b \cdot (x'_{2n-1} + x'_{2n+1}),$$

where $b$ and $x''_{2n}$ are the second stage lifting parameter and output. The third
and fourth stage lifting can be performed similarly:

$$H_n = x'_{2n+1} + c \cdot (x''_{2n} + x''_{2n+2}),$$
$$L_n = x''_{2n} + d \cdot (H_{n-1} + H_n),$$

where $H_n$ and $L_n$ are the resultant high- and low-pass coefficients. The values of
the lifting parameters a, b, c, d are shown in Figure 9.

As illustrated in Figure 10, we may invert the dataflow, and derive an inverse 
lifting of the 9-7 bi-orthogonal wavelet. 

Since the actual data in an image transform is finite in length, boundary
extension is a crucial part of every wavelet decomposition scheme. For a symmetric
odd-tap filter (the bi-orthogonal 9-7 wavelet falls into this category), symmetric
boundary extension can be used. The data are reflected symmetrically along
the boundary, with the boundary points themselves not involved in the
reflection. An example boundary extension with four data points $x_0$, $x_1$, $x_2$ and $x_3$
[Figure 10 panels: TRANSFORM; INVERSE TRANSFORM; legend: Original, High, Low]




Figure 10. Forward and inverse lifting (9-7 bi-orthogonal wavelet). 



is shown in Figure 11. Because both the extended data and the lifting struc- 
ture are symmetric, all the intermediate and final results of the lifting are also 
symmetric with respect to the boundary points. Using this observation, it is 
sufficient to double the lifting parameters of the branches that are pointing to- 
ward the boundary, as shown in the middle of Figure 11. Thus, the boundary 
extension can be performed without additional computational complexity. The 
inverse lifting can again be derived by inverting the dataflow, as shown in the 
right of Figure 11. Again, the parameters for branches that are pointing toward 
the boundary points are doubled. 
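
The four lifting stages and the symmetric boundary extension can be put together in a short Python sketch. Several things here are assumptions and not taken from the text: the numerical values of a, b, c, d are the commonly quoted 9-7 lifting constants (the text gives its values only in Figure 9), the function names are hypothetical, the input is assumed to have at least two samples, and no final normalization of the two outputs is applied.

```python
# Commonly quoted 9-7 lifting constants (assumed; see the lead-in above).
A, B, C, D = -1.586134342, -0.05298011854, 0.8829110762, 0.4435068522


def _mirror(i, n):
    """Symmetric extension that does not repeat the boundary sample."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def _lift(x, parity, coef):
    """Update samples of one parity from their two (mirrored) neighbours."""
    n = len(x)
    for i in range(parity, n, 2):
        x[i] += coef * (x[_mirror(i - 1, n)] + x[_mirror(i + 1, n)])


def forward_97(signal):
    """Four lifting stages; returns (low-pass, high-pass) coefficients."""
    x = [float(v) for v in signal]
    _lift(x, 1, A)   # stage 1: odd samples
    _lift(x, 0, B)   # stage 2: even samples
    _lift(x, 1, C)   # stage 3: odd samples become H_n
    _lift(x, 0, D)   # stage 4: even samples become L_n
    return x[0::2], x[1::2]


def inverse_97(low, high):
    """Undo the stages in reverse order with negated parameters."""
    x = [0.0] * (len(low) + len(high))
    x[0::2] = low
    x[1::2] = high
    _lift(x, 0, -D)
    _lift(x, 1, -C)
    _lift(x, 0, -B)
    _lift(x, 1, -A)
    return x


signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
low, high = forward_97(signal)
restored = inverse_97(low, high)
print(max(abs(r - s) for r, s in zip(restored, signal)))
```

At a boundary the mirrored neighbour coincides with the interior neighbour, so the update reduces to doubling the parameter of the single inward branch, as described in the text. Because each stage only adds a function of the other group, the inverse recovers the input up to floating point rounding.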




Figure 11. Symmetric boundary extension of bi-orthogonal 9-7 wavelet on 4 
data points. 

4.4. Two-dimensional wavelet transform. To apply a wavelet transform
to an image we need to use a 2D version. In this case it is common to apply 
the wavelet transform separately in the horizontal and vertical directions. This 
approach is called the separable 2D wavelet transform. It is possible to design 
a nonseparable 2D wavelet (see [32], for example), but this generally increases 
computational complexity with little additional coding gain. A sample one- 
scale separable 2D wavelet transform is shown in Figure 12. The 2D data array 
representing the image is first filtered in the horizontal direction, which results in 
two subbands: a horizontal low-pass and a horizontal high-pass subband. These 
subbands are then passed through a vertical wavelet filter. The image is thus 
decomposed into four subbands: LL (low-pass horizontal and vertical filter), LH 
(low-pass vertical and high-pass horizontal filter), HL (high-pass vertical and low- 
pass horizontal filter) and HH (high-pass horizontal and vertical filter). Since 
the wavelet transform is linear, we may switch the order of the horizontal and 
vertical filters yet still reach the same effect. By further decomposing subband 
LL with another 2D wavelet (and iterating this procedure), we derive a multiscale 
dyadic wavelet pyramid. Recall that such a pyramid was illustrated in Figure 4.
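
The separable construction, and the fact that the order of the horizontal and vertical passes does not matter, can be demonstrated in a few lines of Python. To keep the sketch short, the one-dimensional transform used here is a simple Haar averaging/differencing step rather than the 9-7 wavelet; that substitution and all function names are assumptions made for illustration.

```python
def haar_1d(v):
    """One-level Haar step on an even-length list: averages, then differences."""
    low = [(v[i] + v[i + 1]) / 2 for i in range(0, len(v), 2)]
    high = [(v[i] - v[i + 1]) / 2 for i in range(0, len(v), 2)]
    return low + high


def transpose(img):
    return [list(col) for col in zip(*img)]


def transform_rows(img):
    """Horizontal pass: filter each row."""
    return [haar_1d(row) for row in img]


def transform_cols(img):
    """Vertical pass: filter each column (rows of the transpose)."""
    return transpose(transform_rows(transpose(img)))


def separable_2d(img):
    """One-scale separable 2D transform: horizontal pass, then vertical pass."""
    return transform_cols(transform_rows(img))


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
rows_first = separable_2d(image)
cols_first = transform_rows(transform_cols(image))
print(rows_first == cols_first)
print(rows_first[0][:2], rows_first[1][:2])  # the LL block (top-left quadrant)
```

In this layout the top-left quadrant holds the LL subband, which is the block that would be decomposed again to build the next level of the pyramid.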

4.5. Line-based lifting. A trick in implementing the 2D wavelet transform is
line-based lifting, which avoids buffering the entire 2D image during the vertical
wavelet lifting operation. The concept is shown in Figure 13, which is very
similar to Figure 9, except that here each circle represents an entire line (row)
of the image. Instead of performing the lifting stage by stage, as in Figure 9,
line-based lifting computes the vertical low- and high-pass lifting one line at a
time. The operation can be described as follows:

Step 1: Initialization, phase 1. Three lines of coefficients $x_0$, $x_1$ and $x_2$ are
processed. Two lines of lifting operations are performed, and intermediate results
$x'_1$ and $x''_0$ are generated.

Figure 13. Line-based lifting wavelet (bi-orthogonal 9-7 wavelet).

Step 2: Initialization, phase 2. Two additional lines of coefficients $x_3$ and $x_4$ are
processed. Four lines of lifting operations are performed. The outcomes are
the intermediate results $x'_3$ and $x''_2$, and the first line of low- and high-pass
coefficients $L_0$ and $H_0$.

Step 3: Repeated processing. During the normal operation, the line based lift- 
ing module reads in two lines of coefficients, performs four lines of lifting 
operations, and generates one line of low and high-pass coefficients. 

Step 4: Flushing. When the bottom of the image is reached, symmetrical bound- 
ary extension is performed to correctly generate the final low and high-pass 
coefficients. 

For the 9-7 bi-orthogonal wavelet, with line-based lifting, only six lines of working 
memory are required to perform the 2D lifting operation. By eliminating the 
need to buffer the entire image during the vertical wavelet lifting operation, the 
cost of implementing the 2D wavelet transform can be greatly reduced.



5. Quantization and Partitioning 



After the wavelet transform, all wavelet coefficients are uniformly quantized
according to the rule

$$w_{m,n} = \operatorname{sign}(s_{m,n}) \left\lfloor \frac{|s_{m,n}|}{\delta} \right\rfloor,$$

where $s_{m,n}$ is the transform coefficient, $w_{m,n}$ is the quantization result, $\delta$ is the
quantization step size, $\operatorname{sign}(x)$ returns the sign of coefficient $x$, and $\lfloor \cdot \rfloor$ is the
floor function. The effect of quantization is demonstrated in Figure 14.
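
A direct transcription of this rule into Python is given below as a minimal sketch; the function name `quantize` is an assumption. With a step size of 1 it reproduces the sample values shown in Figure 14.

```python
import math


def quantize(s, delta):
    """Quantize one coefficient: sign(s) * floor(|s| / delta)."""
    magnitude = math.floor(abs(s) / delta)
    return int(math.copysign(magnitude, s))


coefficients = [4123, -12.4, -96.7, 4.5]
print([quantize(s, 1) for s in coefficients])
```

Because the floor is applied to the magnitude, values are rounded toward zero, so the interval of inputs mapped to zero is twice as wide as every other quantization interval.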
[Figure 14 panels: TRANSFORM COEFF (4123, -12.4, -96.7, 4.5, ...); QUANTIZE COEFF, Q = 1 (4123, -12, -96, 4, ...)]



Figure 14. Effect of quantization. 



The quantization process of JPEG 2000 is very similar to that of a conven- 
tional coder such as JPEG. However, the functionality is very different. In a 
conventional coder, since the quantization result is losslessly encoded, the quan- 
tization process determines the allowable distortion of the transform coefficients. 
In JPEG 2000, the quantized coefficients are lossily encoded through an
embedded coder, so additional distortion can be introduced in the entropy coding
steps. Thus, the main function of the quantization module is to map the
coefficients from floating point representation into integers so that they can be more
efficiently processed by the entropy coding module. The image coding quality is
not determined by the quantization step size $\delta$ but by the subsequent bitstream
assembler. The default quantization step size in JPEG 2000 is rather fine, e.g., 




The quantized coefficients are then partitioned into packets. Each subband is
divided into non-overlapping rectangles of equal size. As described above, this
means three rectangles corresponding to the subbands HL, LH, HH of each
resolution level. The packet partition provides spatial locality, as it contains
the information needed for decoding the image of a certain spatial region at a
certain resolution.

The packets are further divided into non-overlapping rectangular code-blocks,
which are the fundamental entities in the entropy coding operation. By applying
the entropy coder to relatively small code-blocks, the original and working data
of an entire code-block can reside in the cache of the CPU during the entropy
coding operation. This greatly improves the encoding and decoding speed. In
JPEG 2000, the default size of a code-block is 64 x 64. A sample partition and
code-blocks are shown in Figure 15. We mark the partition with solid thick
lines. The partition contains quantized coefficients at spatial location (128, 128)
to (255, 255) of the resolution 1 subbands LH, HL and HH. It corresponds to
the resolution 1 enhancement of the image with spatial location (256, 256) to
(511, 511). The partition is further divided into twelve 64 x 64 code-blocks,
which are shown as numbered blocks in Figure 15.
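
The count of twelve code-blocks in this example can be reproduced with a small Python sketch. The function name and the tuple layout are hypothetical, and the sketch simply tiles the rectangle from its own top-left corner, which is an assumption that happens to be harmless here because 128 is a multiple of 64.

```python
def code_blocks(x0, y0, x1, y1, block=64, subbands=("LH", "HL", "HH")):
    """List code-blocks covering the inclusive rectangle (x0, y0)-(x1, y1).

    Each entry is (subband, left, top, right, bottom) in subband coordinates.
    """
    blocks = []
    for band in subbands:
        for top in range(y0, y1 + 1, block):
            for left in range(x0, x1 + 1, block):
                right = min(left + block - 1, x1)
                bottom = min(top + block - 1, y1)
                blocks.append((band, left, top, right, bottom))
    return blocks


partition = code_blocks(128, 128, 255, 255)
print(len(partition))
print(partition[0])
```

Each of the three subbands contributes a 2 x 2 grid of 64 x 64 blocks, giving the twelve code-blocks of the sample partition.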




Figure 15. A sample partition and code-blocks. 



6. Block Entropy Coding 

Following the partitioning, each code-block is then independently encoded 
through a subbitplane entropy coder. As shown in Figure 16, the input of the 
block entropy coding module is the code-block, which can be represented as a 2D 
array of data. The output of the module is an embedded compressed bitstream,
which can be truncated at any point and still be decodable, and a rate-distortion 
(R-D) curve (see Figure 16). 

It is the responsibility of the block entropy coder to measure both the coding 
rate and distortion during the encoding process. The coding rate is derived 
directly through the length of the coding bitstream at certain instances, e.g., at 
the end of each subbitplane. The coding distortion is obtained by measuring 
the distortion between the original coefficient and the reconstructed coefficient 
at the same instance. 

JPEG 2000 employs a subbitplane entropy coder. In what follows, we examine 
three key parts of the coder: the coding order, the context, and the arithmetic 
MQ-coder. 

6.1. Embedded coding. Assume that each quantized coefficient $w_{m,n}$ is
represented in the binary form as

$$\pm b_1 b_2 \cdots b_n,$$
[Figure 16 label: 2D Data Array]



Figure 16. Block entropy coding. 



where $b_1$ is the most significant bit (MSB), $b_n$ is the least significant bit
(LSB), and $\pm$ represents the sign of the coefficient. It is the job of the entropy
coding module to first convert this array of bits into a single sequence of
binary bits, and then compress this bit sequence with a lossless coder, such as
an arithmetic coder [22]. A bitplane is defined as the group of bits at a given
level of significance. Thus, for each code-block there is a bitplane consisting of all
MSBs, one of all LSBs, and one for each of the significance levels that occur in
between. By coding the more significant bits of all coefficients first, and coding
the less significant bits later, the resulting compressed bitstream is said to have
the embedding property, reflecting the fact that a bitstream of lower bit
rate can be obtained by simply truncating a higher rate bitstream, so that the
entire output stream has embedded in it bitstreams of lower bit rate that
still make possible partial decoding of all coefficients. A sample binary
representation of the coefficients is shown in Figure 17. Since representing bits in
a 2D block results in a 3D bit array (the third dimension is bit significance), which
is very difficult to draw, we only show the binary representation of a column of
coefficients as a 2D bit array in Figure 17. However, keep in mind that the true
bit array in a code-block is 3D.

The bits in the bit array are very different, both in their statistical
properties and in their contribution to the quality of the decoded code-block.
The sign bit is obviously different in nature from the coefficient bits. Bits
at different significance levels contribute differently to the quality of the
decoded code-block. And even within the same bitplane, bits may have different
statistical properties and contributions to the quality of decoding. Let $b_m$
be a bit in a coefficient x. If all more significant bits in the same
coefficient x are 0s, the coefficient x is said to be insignificant (because
if the bitstream is terminated at this point or before, coefficient x will be
reconstructed to zero), and the current bit $b_m$ is to be encoded in the mode
of significance identification. Otherwise, the coefficient is said to be
significant, and the bit is to be encoded in the mode of refinement. Depending
on the sign of the coefficient, the coefficient can be positive significant or
negative significant. We distinguish between significance identification and
refinement bits because the significance identification bit has a very high
probability of being 0, while the refinement bit is usually equally
distributed between 0 and 1. The sign of the coefficient needs to be encoded
immediately after the coefficient turns significant, i.e., after the first
non-zero bit in the coefficient is encoded. For the bit array in Figure 17,
the significance identification and the refinement bits are shown with
different shades in Figure 18.

Figure 17. Coefficients and binary representation.




Figure 18. Embedded coding of bit array. 
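The distinction between significance identification, sign, and refinement
symbols can be made concrete with a short sketch. The function name
`bit_modes` and its output format are illustrative assumptions, not part of
the standard; the sketch only labels the symbols of a single coefficient and
does no entropy coding.

```python
def bit_modes(coefficient, n_bitplanes):
    """List (bitplane, symbol, mode) for one quantized coefficient, MSB first.

    Hypothetical helper: bitplane 1 holds b_1 (the MSB), bitplane n holds b_n.
    """
    magnitude = abs(coefficient)
    significant = False
    symbols = []
    for i in range(1, n_bitplanes + 1):
        bit = (magnitude >> (n_bitplanes - i)) & 1
        if significant:
            # All bits after the first 1 are refinement bits.
            symbols.append((i, bit, "refinement"))
        else:
            symbols.append((i, bit, "significance"))
            if bit == 1:
                # The sign follows immediately once the coefficient
                # turns significant.
                significant = True
                symbols.append((i, "-" if coefficient < 0 else "+", "sign"))
    return symbols


# -5 has magnitude 0101 with four bitplanes.
print(bit_modes(-5, 4))
```

For the coefficient -5 the sketch emits two significance identification bits
(0, then 1), the sign, and then two refinement bits.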
6.2. Context. It has been pointed out [14; 21] that the statistics of
significance identification bits, refinement bits, and signs can vary
tremendously. For example, if a quantized coefficient is of large magnitude,
its neighbor coefficients may be of large magnitude as well. This is because a
large coefficient locates an anomaly (e.g., a sharp edge) in the smooth
signal, and such an anomaly usually causes a cluster of large wavelet
coefficients in the neighborhood as well. To account for such statistical
variation, we entropy encode the significance identification bits, refinement
bits and signs with context, each of which is a number derived from already
coded coefficients in the neighborhood of the current coefficient. The bit
array that represents the data is thus turned into a sequence of bit-context
pairs, as shown in Figure 19, which is subsequently encoded by a context
adaptive entropy coder. In the bit-context pair, it is the bit information
that is actually encoded. The context associated with the bit is determined
from the already encoded information. It can be derived by the encoder and the
decoder alike, provided both use the same rule to generate the context. Bits
in the same context are considered to have similar statistical properties, so
that the entropy coder can measure the probability distribution within each
context and efficiently compress the bits.





Figure 19. Coding bits and contexts. The context is derived from information
from the already coded bits.



In the following, we describe the contexts that are used in the significance
identification, refinement and sign coding of JPEG 2000. For the rationale of
the context design, we refer to [2; 19]. Determining the context of a
significance identification bit is a two-step process:

Step 1: Neighborhood statistics. For each bit of the coefficient, the numbers
of significant horizontal, vertical and diagonal neighbors are counted as h, v
and d, as shown in Figure 20.

Step 2: Lookup table. According to the direction of the subband in which the
coefficient is located (LH, HL, HH), the context of the encoding bit is
indexed through one of the three tables shown in Table 1. A total of nine
context categories are used for significance identification coding. The table
lookup process reduces the number of contexts and enables the statistics
within each context to be obtained quickly.

Table 1. Context for the significance identification coding.



Figure 20. Number of significant neighbors: horizontal (h), vertical (v) and
diagonal (d).
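The neighborhood statistics of Step 1 can be sketched as follows. The function
name `neighbor_counts` and the representation of the significance map as a
nested list of booleans are assumptions made for illustration; neighbors that
fall outside the array are treated here as insignificant, which is also an
assumption of this sketch. The lookup of Step 2 is not reproduced.

```python
def neighbor_counts(sig, r, c):
    """Count significant horizontal, vertical and diagonal neighbors.

    sig is a 2-D list of booleans (True = significant); (r, c) is the
    position of the current coefficient. Returns the tuple (h, v, d).
    """
    rows, cols = len(sig), len(sig[0])

    def s(y, x):
        # Out-of-range neighbors count as insignificant (assumption).
        return 1 if 0 <= y < rows and 0 <= x < cols and sig[y][x] else 0

    h = s(r, c - 1) + s(r, c + 1)
    v = s(r - 1, c) + s(r + 1, c)
    d = (s(r - 1, c - 1) + s(r - 1, c + 1)
         + s(r + 1, c - 1) + s(r + 1, c + 1))
    return h, v, d


sig_map = [
    [True,  False, True],
    [True,  False, False],
    [False, True,  False],
]
print(neighbor_counts(sig_map, 1, 1))
```

The counts h and v lie between 0 and 2, and d lies between 0 and 4.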




To determine the context for sign coding, we calculate a horizontal sign count
h and a vertical sign count v. The sign count takes a value of $-1$ if both
horizontal/vertical coefficients are negative significant, or if one
coefficient is negative significant and the other is insignificant. It takes a
value of $+1$ if both horizontal/vertical coefficients are positive
significant, or if one coefficient is positive significant and the other is
insignificant. The value of the sign count is 0 if both horizontal/vertical
coefficients are insignificant, or if one coefficient is positive significant
and the other is negative significant.

With the horizontal and vertical sign count h and v, an expected sign and a 
context for sign coding can then be calculated according to Table 2. 
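The sign count rule can be written compactly if each of the two neighbors is
encoded as +1 (positive significant), -1 (negative significant) or 0
(insignificant). This encoding and the function name `sign_count` are
illustrative assumptions; the mapping from (h, v) to a context and an expected
sign via Table 2 is not reproduced here.

```python
def sign_count(first, second):
    """Sign count of a horizontal (or vertical) pair of neighbors.

    Each argument is +1, -1 or 0 as described above. The result is the
    sign of their sum: -1, 0 or +1.
    """
    total = first + second
    return (total > 0) - (total < 0)


for pair in [(-1, -1), (-1, 0), (1, 1), (0, 1), (0, 0), (1, -1)]:
    print(pair, sign_count(*pair))
```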

To calculate the context for the refinement bits, we check whether the current
refinement bit is the first bit after significance identification, and whether
there are any significant coefficients among the immediate eight neighbors,
i.e., h + v + d > 0. The context for the refinement bit is tabulated in
Table 3.

Table 2. Context and the expected sign for sign coding.



Context 14: Current refinement bit is the first bit after significance
identification and there is no significant coefficient in the eight neighbors.

Context 15: Current refinement bit is the first bit after significance
identification and there is at least one significant coefficient in the eight
neighbors.

Context 16: Current refinement bit is at least two bits away from significance
identification.



Table 3. Context for the refinement bit. 
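Table 3 translates directly into a small function. The name
`refinement_context` and its argument names are illustrative; h, v and d are
the neighbor counts defined for significance identification.

```python
def refinement_context(first_refinement, h, v, d):
    """Return the refinement context number (14, 15 or 16) of Table 3.

    first_refinement is True when the bit is the first one coded after
    the coefficient became significant.
    """
    if not first_refinement:
        return 16
    return 15 if h + v + d > 0 else 14


print(refinement_context(True, 0, 0, 0))   # no significant neighbor
print(refinement_context(True, 1, 0, 2))   # at least one significant neighbor
print(refinement_context(False, 0, 1, 0))  # later refinement bit
```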

6.3. MQ-coder: context dependent entropy coder. Through the aforementioned
process, a data array is turned into a sequence of bit-context pairs, as shown
in Figure 19. All bits associated with the same context are assumed to be
independently and identically distributed. Let the number of contexts be N,
and let there be $n_i$ bits in context i, within which the probability of the
bits taking value 1 is $p_i$. Using classic Shannon information theory
[15; 16], the entropy of such a bit-context sequence can be calculated as

$$H = \sum_{i=0}^{N-1} n_i \bigl( -p_i \log_2 p_i - (1-p_i) \log_2 (1-p_i) \bigr). \tag{6-1}$$
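As a small numerical illustration of (6-1), the following sketch evaluates the
entropy of a bit-context sequence from per-context bit counts and
probabilities. The function name `context_entropy` and the sample numbers are
made up for illustration.

```python
import math


def context_entropy(counts, probs):
    """Evaluate (6-1): total entropy in bits of a bit-context sequence.

    counts[i] is n_i, the number of bits in context i; probs[i] is p_i,
    the probability that a bit in context i equals 1.
    """
    total = 0.0
    for n_i, p_i in zip(counts, probs):
        if 0.0 < p_i < 1.0:  # a deterministic context costs no bits
            total += n_i * (-p_i * math.log2(p_i)
                            - (1.0 - p_i) * math.log2(1.0 - p_i))
    return total


# A skewed context is much cheaper to code than a balanced one.
print(context_entropy([1000], [0.5]))
print(context_entropy([1000], [0.02]))
```

The second call yields far fewer bits than the first, which is why grouping
highly skewed significance identification bits into their own contexts pays
off.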

The task of the context entropy coder is thus to convert the sequence of bit- 
context pairs into a compact bitstream representation with length as close to the 
Shannon limit as possible, as shown in Figure 21. Several coders are available for 
such task. The coder used in JPEG 2000 is the MQ-coder. In the following, we 
focus the discussion on three key aspects of the MQ-coder: general arithmetic 
coding theory, fixed point arithmetic implementation and probability estimation. 
For more details, we refer to [22; 23]. 



[Diagram: BITS and CTX enter the MQ-coder, which outputs BITSTREAM.]

Figure 21. Input and output of the MQ-coder.
6.3.1. The Elias coder. The basic theory of the MQ-coder can be traced to the
Elias coder [24], or recursive probability interval subdivision. Let
$S_0 S_1 S_2 \cdots S_n$ be a series of binary bits that is sent to the
arithmetic coder. Let $P_i$ be the probability that the bit $S_i$ is 1. We may
form a binary representation (the coding bitstream) of the original bit
sequence by the following process:

Step 1: Initialization. Let the initial probability interval be (0, 1). We
denote the current probability interval as $(C, C+A)$, where $C$ is the bottom
of the probability interval, and $A$ is the size of the interval. At
initialization, we have $C = 0$ and $A = 1$.

Step 2: Probability interval subdivision. The binary symbols
$S_0 S_1 S_2 \cdots S_n$ are encoded sequentially. For each symbol $S_i$, the
probability interval $(C, C+A)$ is subdivided into two sub-intervals
$(C, C+A(1-P_i))$ and $(C+A(1-P_i), C+A)$. Depending on whether the symbol
$S_i$ is 0 or 1, one of the two sub-intervals is selected:

$$\begin{cases}
C \leftarrow C, \quad A \leftarrow A(1-P_i), & \text{if } S_i = 0, \\
C \leftarrow C + A(1-P_i), \quad A \leftarrow A P_i, & \text{if } S_i = 1.
\end{cases} \tag{6-2}$$




Figure 22. Probability interval subdivision. 

Step 3: Bitstream output. Let the final coding bitstream be
$k_1 k_2 \cdots k_m$, where $m$ is the compressed bitstream length. The final
bitstream creates an uncertainty interval whose upper and lower bounds are

$$\text{Upper bound } D = 0.k_1 k_2 \cdots k_m 111\cdots,$$

$$\text{Lower bound } B = 0.k_1 k_2 \cdots k_m 000\cdots.$$

As long as the uncertainty interval $(B, D)$ is contained in the probability
interval $(C, C+A)$, the coding bitstream uniquely identifies the final
probability interval, and thus uniquely identifies each subdivision in the
Elias coding process. The entire binary symbol string
$S_0 S_1 S_2 \cdots S_n$ can thus be recovered from the compressed
representation. It can be shown that it is possible to find a final coding
bitstream with length

$$m \le \lceil -\log_2 A \rceil + 1$$

to represent the final probability interval $(C, C+A)$. Notice that $A$ is the
probability of the occurrence of the binary string $S_0 S_1 S_2 \cdots S_n$,
and the entropy of the original symbol stream can be calculated as

$$H = \sum_{S_0 S_1 \cdots S_n} -A \log_2 A.$$

The arithmetic coder thus encodes the binary string within 2 bits of its
entropy limit, no matter how long the symbol string is. This is very
efficient.
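The three steps of the Elias coder can be sketched with exact rational
arithmetic, which sidesteps the precision problem at the cost of speed. This
is a minimal illustration rather than the MQ-coder: the function names
`elias_encode` and `elias_decode` are hypothetical, the probabilities are
passed in as known `Fraction` values strictly between 0 and 1, and the decoder
is assumed to know how many symbols to decode.

```python
import math
from fractions import Fraction


def elias_encode(bits, probs):
    """Encode bits (0/1) given probs[i] = P(bit i is 1) as Fractions."""
    c, a = Fraction(0), Fraction(1)          # Step 1: interval (C, C+A)
    for s, p in zip(bits, probs):            # Step 2: subdivision (6-2)
        if s == 0:
            a = a * (1 - p)
        else:
            c = c + a * (1 - p)
            a = a * p
    # Step 3: smallest m with 2 * 2^-m <= A, so that a dyadic interval
    # [B, B + 2^-m) is guaranteed to fit inside [C, C+A).
    m = 1
    while Fraction(2, 2 ** m) > a:
        m += 1
    k = math.ceil(c * 2 ** m)                # B = k / 2^m >= C
    return format(k, "0{}b".format(m))


def elias_decode(code, probs):
    """Recover len(probs) symbols by repeating the same subdivisions."""
    b = Fraction(int(code, 2), 2 ** len(code))
    c, a = Fraction(0), Fraction(1)
    bits = []
    for p in probs:
        split = c + a * (1 - p)
        if b < split:
            bits.append(0)
            a = a * (1 - p)
        else:
            bits.append(1)
            c, a = split, a * p
    return bits


message = [0, 1, 1, 0, 0, 0, 1, 0]
probabilities = [Fraction(1, 4)] * len(message)
code = elias_encode(message, probabilities)
print(code, elias_decode(code, probabilities) == message)
```

The loop that chooses m picks the smallest length for which containment is
guaranteed in general, which coincides with the bound on m stated above.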

6.3.2. The arithmetic coder: finite precision arithmetic operations. Exact
implementation of Elias coding requires infinite precision arithmetic, an
unrealistic assumption in real applications. The arithmetic coder is developed
from Elias coding using finite precision. Observing that the coding interval
$A$ becomes very small after a few operations, we may normalize the coding
interval parameters $C$ and $A$ as

$$C = 1.5 \cdot [0.k_1 k_2 \cdots k_L] + 2^{-L} \cdot 1.5 \cdot C_x, \qquad
A = 2^{-L} \cdot 1.5 \cdot A_x,$$

where $L$ is a normalization factor determining the magnitude of the interval
$A$, while $A_x$ and $C_x$ are fixed-point integers representing values in the
ranges (0.75, 1.5) and (0, 1.5), respectively. Bits $k_1 k_2 \cdots k_m$ are
the output bits that have already been determined (in reality, certain
carryover operations have to be handled to derive the true output bitstream).
By representing the probability interval with the normalization $L$ and
fixed-point integers $A_x$ and $C_x$, it is possible to use fixed-point
arithmetic and normalization operations for the probability interval
subdivision operation. Moreover, since the value of $A_x$ is close to 1.0, we
may approximate $A_x \cdot P_i$ with $P_i$, so that the interval subdivision
operation (6-2) is calculated as

$$\begin{cases}
C_x = C_x, \quad A_x = A_x - P_i, & \text{if } S_i = 0, \\
C_x = C_x + A_x - P_i, \quad A_x = P_i, & \text{if } S_i = 1,
\end{cases}$$

which can be done quickly without any multiplication. The compression
performance suffers a little, as the coding interval now has to be
approximated with a fixed-point integer, and $A_x \cdot P_i$ is approximated
with $P_i$. However, experiments show that the degradation in compression
performance is less than three percent, which is well worth the saving in
implementation complexity.
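The multiplication-free update can be sketched as below. This is only an
illustration of the approximation $A_x \cdot P_i \approx P_i$, not the
MQ-coder itself: the names `subdivide_approx` and `renormalize` are
hypothetical, exact fractions stand in for fixed-point integers, carry
handling and bit output are omitted, and renormalizing by repeated doubling
until $A_x \ge 0.75$ is an assumption of this sketch. It also assumes
$P_i < A_x$ so that both sub-intervals stay non-empty.

```python
from fractions import Fraction


def subdivide_approx(a_x, p_i, s_i):
    """Return (amount added to C_x, new A_x) without any multiplication."""
    if s_i == 0:
        return Fraction(0), a_x - p_i
    return a_x - p_i, p_i


def renormalize(a_x):
    """Double A_x until it is at least 0.75; return (A_x, number of shifts).

    Each shift corresponds to increasing the normalization factor L by one.
    """
    shifts = 0
    while a_x < Fraction(3, 4):
        a_x *= 2
        shifts += 1
    return a_x, shifts


a_x, p_i = Fraction(9, 8), Fraction(1, 4)
offset, new_a = subdivide_approx(a_x, p_i, 1)
print(offset, new_a, renormalize(new_a))
# Exact sub-interval size would be A_x * P_i; the sketch uses P_i instead.
print(a_x * p_i, p_i)
```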

6.3.3. Probability estimation. In the arithmetic coder it is necessary to
estimate the probability $P_i$ for each binary symbol $S_i$ to take the value
1. This is where context comes into play. Within each context, it is assumed
that the symbols are independently identically distributed. We may then
estimate the probability of the symbol within each context through observation
of the past behavior of symbols in the same context. For example, if we
observe $n_i$ symbols in context i, with $o_i$ symbols being 1, we may
estimate the probability that a symbol takes on the value 1 in context i
through Bayesian estimation as

$$p_i = \frac{o_i + 1}{n_i + 2}.$$

In the MQ-coder [22], probability estimation is implemented through a state- 
transition machine. It may estimate the probability of the context more effi- 
ciently, and may take into consideration the non-stationary characteristic of the 
symbol string. Nevertheless, the principle is still to estimate the probability 
based on past behavior of the symbols in the same context. 
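The counting estimator described above can be sketched as a small adaptive
model that both encoder and decoder could update in lockstep. The class name
`ContextEstimator` is hypothetical, and this is the plain counting estimate,
not the state-transition machine used by the MQ-coder.

```python
from collections import defaultdict
from fractions import Fraction


class ContextEstimator:
    """Per-context estimate of P(bit = 1) as (ones + 1) / (count + 2)."""

    def __init__(self):
        self.count = defaultdict(int)  # symbols observed in each context
        self.ones = defaultdict(int)   # how many of them were 1

    def probability_of_one(self, ctx):
        return Fraction(self.ones[ctx] + 1, self.count[ctx] + 2)

    def update(self, ctx, bit):
        # Called after the bit is coded, so the decoder can mirror it.
        self.count[ctx] += 1
        self.ones[ctx] += bit


est = ContextEstimator()
print(est.probability_of_one(3))       # no observations yet
for bit in (1, 1, 0):
    est.update(3, bit)
print(est.probability_of_one(3))       # after observing 1, 1, 0
```

With no observations the estimate is 1/2, and it moves toward the observed
frequency as more symbols of the same context are seen.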

6.4. Coding order: subbitplane entropy coder. In JPEG 2000, because the
embedded bitstream of a code-block may be truncated, the coding order, which
is the order in which the data array is turned into a sequence of bit-context
pairs, is of paramount importance. A sub-optimal coding order may allow
important information to be lost after the coding bitstream is truncated, and
lead to severe coding distortion. It turns out that the optimal coding order
first encodes those bits with the steepest rate-distortion slope, which is
defined as the coding distortion decrease per bit spent [21]. Just as the
statistical properties of the bits differ across the bit array, so does their
contribution to the coding distortion decrease per bit.

Consider a bit $b_i$ in the $i$-th most significant bitplane, where there are
a total of $n$ bitplanes. If the bit is a refinement bit, then prior to the
coding of the bit, the uncertainty interval of the coefficient is
$(A, A+2^{n-i})$. After the refinement bit has been encoded, the coefficient
lies either in $(A, A+2^{n-i-1})$ or in $(A+2^{n-i-1}, A+2^{n-i})$. If we
further assume that the value of the coefficient is uniformly distributed in
the uncertainty interval, we may calculate the expected distortion before and
after the coding as

$$D_{\mathrm{prev,REF}} = \frac{1}{2^{n-i}} \int_A^{A+2^{n-i}}
(x - A - 2^{n-i-1})^2 \, dx = \frac{1}{12}\, 4^{n-i},$$

$$D_{\mathrm{post,REF}} = \frac{1}{12}\, 4^{n-i-1}.$$

Since the value of the coefficient is uniformly distributed in the uncertainty
interval, the refinement bit takes the values 0 and 1 with equal probability;
thus, the coding rate of the refinement bit is

$$R_{\mathrm{REF}} = H(b_i) = 1 \text{ bit}. \tag{6-3}$$

The rate-distortion slope of the refinement bit at the $i$-th most significant
bitplane is thus

$$s_{\mathrm{REF}}(i) =
\frac{D_{\mathrm{prev,REF}} - D_{\mathrm{post,REF}}}{R_{\mathrm{REF}}}
= \frac{\frac{1}{12} 4^{n-i} - \frac{1}{12} 4^{n-i-1}}{1}
= 4^{n-i-2}. \tag{6-4}$$

In the same way, we may calculate the expected distortion decrease and coding
rate for a significance identification bit at the $i$-th most significant
bitplane. Before the coding of the bit, the uncertainty interval of the
coefficient ranges from $-2^{n-i}$ to $2^{n-i}$. After the bit has been
encoded, if the coefficient becomes significant, it lies in
$(-2^{n-i}, -2^{n-i-1})$ or $(+2^{n-i-1}, +2^{n-i})$, depending on the sign of
the coefficient. If the coefficient is still insignificant, it lies in
$(-2^{n-i-1}, 2^{n-i-1})$. We note that if the coefficient is still
insignificant, the reconstructed coefficient before and after coding will both
be 0, which leads to no distortion decrease (coding improvement). The coding
distortion only decreases if the coefficient becomes significant. Assuming the
probability that the coefficient becomes significant is $p$, and the
coefficient is uniformly distributed within the significance interval
$(-2^{n-i}, -2^{n-i-1})$ or $(+2^{n-i-1}, +2^{n-i})$, we may calculate the
expected coding distortion decrease as

$$D_{\mathrm{prev,SIG}} - D_{\mathrm{post,SIG}} = p \cdot 2.25 \cdot 4^{n-i-1}. \tag{6-5}$$

Here the factor 2.25 arises because a coefficient uniform on $(a, 2a)$, with
$a = 2^{n-i-1}$, has mean squared value $7a^2/3$ when reconstructed as 0 and
$a^2/12$ when reconstructed at the middle of the interval.

The entropy of the significance identification bit can be calculated as

$$R_{\mathrm{SIG}} = -(1-p) \log_2 (1-p) - p \log_2 p + p \cdot 1 = p + H(p),$$

where $H(p) = -(1-p) \log_2 (1-p) - p \log_2 p$ is the entropy of the binary
symbol with the probability of 1 being $p$. The term $p \cdot 1$ in
$R_{\mathrm{SIG}}$ accounts for the one bit which is needed to encode the sign
of the coefficient if it becomes significant.

We may then derive the expected rate-distortion slope for the significance
identification bit coding as

$$s_{\mathrm{SIG}}(i) =
\frac{D_{\mathrm{prev,SIG}} - D_{\mathrm{post,SIG}}}{R_{\mathrm{SIG}}}
= \frac{9 \cdot 4^{n-i-2}}{1 + H(p)/p}.$$
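The two slopes can be checked from first principles under the stated
uniform-distribution assumptions, with reconstruction at the middle of the
uncertainty interval (or at 0 for an insignificant coefficient). The helper
names below are illustrative; the sketch integrates the squared error exactly
and divides by the coding rate. Under these assumptions the refinement slope
evaluates to $4^{n-i-2}$ and the significance slope to
$9 \cdot 4^{n-i-2} / (1 + H(p)/p)$.

```python
import math
from fractions import Fraction


def mse_uniform(lo, hi, recon):
    """Mean squared error when x ~ Uniform(lo, hi) is reconstructed as recon."""
    lo, hi, recon = Fraction(lo), Fraction(hi), Fraction(recon)
    return ((hi - recon) ** 3 - (lo - recon) ** 3) / (3 * (hi - lo))


def binary_entropy(p):
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def slope_ref(n, i):
    """Distortion decrease per bit for a refinement bit (rate = 1 bit)."""
    w = Fraction(2) ** (n - i)               # width of the interval before
    return mse_uniform(0, w, w / 2) - mse_uniform(0, w / 2, w / 4)


def slope_sig(n, i, p):
    """Expected distortion decrease per bit for a significance bit."""
    a = Fraction(2) ** (n - i - 1)
    # Only coefficients that become significant reduce the distortion.
    gain = mse_uniform(a, 2 * a, 0) - mse_uniform(a, 2 * a, 3 * a / 2)
    return p * float(gain) / (p + binary_entropy(p))


print(slope_ref(8, 3), 4 ** (8 - 3 - 2))
print(slope_sig(8, 3, 0.5), 9 * 4 ** 3 / (1 + binary_entropy(0.5) / 0.5))
```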



From this and (6-4), we arrive at the following conclusions: 



Conclusion 1. The more significant the bitplane in which a bit is located, the
earlier it should be encoded.

A key observation is that, within the same coding category (significance
identification or refinement), each additional level of bitplane significance
translates into 4 times more distortion decrease per coding bit spent.
Therefore, the code-block should be encoded bitplane by bitplane.

Conclusion 2. Within the same bitplane, we should first encode the
significance identification bits with a higher probability of significance.

It can be shown that the function $H(p)/p$ increases monotonically as the
probability of significance decreases. As a result, the higher the probability
of significance, the higher the distortion decrease per coding bit spent.

Conclusion 3. Within the same bitplane, the significance identification bit
should be encoded earlier than the refinement bit if the probability of
significance is higher than 0.01.

It is observed that insignificant coefficients with no significant
coefficients in their neighborhood usually have a probability of significance
below 0.01, while insignificant coefficients with at least one significant
neighbor usually have a higher probability of significance.
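The 0.01 threshold of Conclusion 3 can be checked numerically. Assuming a
refinement slope of $4^{n-i-2}$ and a significance slope of
$9 \cdot 4^{n-i-2} / (1 + H(p)/p)$ within the same bitplane, the significance
bit wins exactly when $H(p)/p < 8$. The sketch below, with hypothetical
function names, finds the crossover by bisection.

```python
import math


def entropy_per_p(p):
    """H(p) / p for the binary entropy function H."""
    return (-p * math.log2(p) - (1 - p) * math.log2(1 - p)) / p


def significance_threshold(lo=1e-6, hi=0.5, iterations=60):
    """Probability p at which H(p)/p = 8, found by bisection.

    H(p)/p decreases as p grows on this range, so the root is bracketed.
    """
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if entropy_per_p(mid) > 8:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(significance_threshold())
```

The crossover lies slightly above 0.01, consistent with the rounded figure
used in Conclusion 3.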

As a result of these three conclusions, the entropy coder in JPEG 2000 encodes
the code-block bitplane by bitplane, from the most significant bitplane to the
least significant bitplane; and within each bitplane, the bit array is further
ordered into three subbitplanes: the predicted significance (PS), the
refinement (REF) and the predicted insignificance (PN).

Using the data array in Figure 23 as an example, we illustrate the block coding 
order of JPEG 2000 with a series of sub-figures in Figure 23. Each sub-figure 
shows the coding of one subbitplane. The block coding order of JPEG 2000 is 
as follows: 

Step 1: The most significant bitplane, the PN subbitplane of $b_1$. (See
Figure 23(a).)

First, the most significant bitplane is examined and encoded. Since at first
all coefficients are insignificant, all bits in the MSB bitplane belong to the
PN subbitplane. Whenever a 1 bit is encountered (rendering the corresponding
coefficient non-zero), the sign of the coefficient is encoded immediately
afterwards. With the information of those bits that have already been coded
and the signs of the significant coefficients, we may figure out an
uncertainty range for each coefficient. The reconstruction value of the
coefficient can also be set, e.g., at the middle of the uncertainty range. The
outcome of our sample bit array after the coding of the most significant
bitplane is shown in Figure 23(a). We show the reconstruction value and the
uncertainty range of each coefficient under the columns “value” and “range” in
the sub-figure, respectively. As the coding proceeds, the uncertainty range
shrinks, yielding a better and better representation of each coefficient.

Step 2: The PS subbitplane of $b_2$. (See Figure 23(b).)

After all bits in the most significant bitplane have been encoded, the coding
proceeds to the PS subbitplane of the second most significant bitplane
($b_2$). The PS subbitplane consists of the bits of the coefficients that are
not significant, but have at least one significant neighbor. The corresponding
subbitplane coding is shown in Figure 23(b). In this example, coefficients
$w_0$ and $w_2$ are the neighbors of the significant coefficient $w_1$, and
they are encoded in this pass. Again, if a 1 bit is encountered, the
coefficient becomes significant, and its sign is encoded right after. The
uncertainty ranges and reconstruction values of the coded coefficients are
updated according to the newly coded information.

Step 3: The REF subbitplane of $b_2$. (See Figure 23(c).)

The coding then moves to the REF subbitplane, which consists of the bits of
the coefficients that already became significant in a previous bitplane. The
significance status of the coefficients is not changed in this pass, and no
sign of coefficients is encoded.

Step 4: The PN subbitplane of $b_2$. (See Figure 23(d).)

Finally, the rest of the bits in the bitplane are encoded in the PN
subbitplane pass, which consists of the bits of the coefficients that are not
significant and have no significant neighbors. The sign is again encoded once
a coefficient turns significant.



Steps 2, 3, and 4 are repeated for the following bitplanes, with the
subbitplane coding order being PS, REF and PN for each bitplane. The block
entropy coding continues until a certain criterion, e.g., the desired coding
rate or coding quality, has been reached, or all bits in the bit array have
been encoded. The output bitstream has the embedding property. If the
bitstream is truncated, the more significant bits of the coefficients can
still be decoded. An estimate of each coefficient is thus obtained, albeit
with a relatively large uncertainty range.
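The PS, REF, PN ordering can be sketched for a one-dimensional column of
coefficients, in the spirit of the column shown in the figures. This is a
simplified illustration: the function name `subbitplane_order` is
hypothetical, only the left and right neighbors are considered, significance
is updated as soon as a 1 bit is coded, and sign coding and context formation
are omitted.

```python
def subbitplane_order(coeffs, n_bitplanes):
    """Return (bitplane, pass, index, bit) tuples in coding order."""
    mags = [abs(c) for c in coeffs]
    significant = [False] * len(coeffs)
    order = []
    for i in range(1, n_bitplanes + 1):
        shift = n_bitplanes - i
        was_significant = significant[:]   # status before this bitplane
        coded = [False] * len(coeffs)

        def emit(pass_name, k):
            bit = (mags[k] >> shift) & 1
            order.append((i, pass_name, k, bit))
            coded[k] = True
            if bit and not was_significant[k]:
                significant[k] = True      # the sign would be coded here

        for k in range(len(coeffs)):       # PS: insignificant, with a
            near = [j for j in (k - 1, k + 1) if 0 <= j < len(coeffs)]
            if not significant[k] and any(significant[j] for j in near):
                emit("PS", k)              # significant neighbor
        for k in range(len(coeffs)):       # REF: significant earlier
            if was_significant[k]:
                emit("REF", k)
        for k in range(len(coeffs)):       # PN: everything left over
            if not coded[k]:
                emit("PN", k)
    return order


for entry in subbitplane_order([3, -9, 2, 0], 4):
    print(entry)
```

In the first bitplane every bit falls in the PN pass, since no coefficient is
significant yet; from the second bitplane on, the neighbors of the already
significant coefficient are visited first.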
Figure 23. Order of coding: (a) Bitplane $b_1$, subbitplane PN, then bitplane
$b_2$, subbitplanes (b) PS, (c) REF and (d) PN.



7. The Bitstream Assembler 

The embedded bitstreams of the code-blocks are assembled by the bitstream
assembler module to form the compressed bitstream of the image. As described
in Section 6, the block entropy coder not only produces an embedded bitstream
for each code-block i, but also records the coding rate $R_i^k$ and distortion
$D_i^k$ at the end of each subbitplane, where k is the index of the
subbitplane. The bitstream assembler module determines how much of the
bitstream of each code-block is put into the final compressed bitstream. It
determines a truncation point $n_i$ for each code-block so that the distortion
of the entire image is minimized subject to a rate constraint:

$$\min \sum_i D_i^{n_i}, \qquad \text{with } \sum_i R_i^{n_i} \le B. \tag{7-1}$$



Since there is a discrete number of truncation points $n_i$, the constrained
minimization problem (7-1) can be solved by distributing bits first to the
code-blocks with the steepest distortion decrease per rate spent. The process
of bit allocation and assembling can be performed as follows:

Step 1: Initialization. We initialize all truncation points to zero:
$n_i = 0$.

Step 2: Incremental bit allocation. For each code-block i, the maximum
possible gain of distortion decrease per rate spent is calculated as

$$S_i = \max_{k > n_i} \frac{D_i^{n_i} - D_i^k}{R_i^k - R_i^{n_i}}.$$

We call $S_i$ the rate-distortion slope of the code-block i. The code-block
with the steepest rate-distortion slope is selected, and its truncation point
is updated as

$$n_i^{\mathrm{new}} = \arg_{k > n_i} \left\{
\frac{D_i^{n_i} - D_i^k}{R_i^k - R_i^{n_i}} = S_i \right\}.$$

A total of $R_i^{n_i^{\mathrm{new}}} - R_i^{n_i}$ bits are sent to the output
bitstream. This leads to a distortion decrease of
$D_i^{n_i} - D_i^{n_i^{\mathrm{new}}}$. It can be easily proved that this is
the maximum distortion decrease achievable for spending
$R_i^{n_i^{\mathrm{new}}} - R_i^{n_i}$ bits.

Step 3: Repeat Step 2 until the required coding rate B is reached.

The above optimization procedure does not take into account the last segment
problem, i.e., the case when the number of coding bits available is smaller
than $R_i^{n_i^{\mathrm{new}}} - R_i^{n_i}$. However, in practice the last
segment is usually very small (within 100 bytes), so that the residual
sub-optimality is not a big concern.
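The incremental allocation of Steps 1 to 3 can be sketched directly. The
function name `greedy_allocate` and the data layout are assumptions for
illustration: `rates[i][k]` and `dists[i][k]` hold the cumulative rate and the
distortion of code-block i at truncation point k, with k = 0 meaning that
nothing has been sent, and rates are assumed strictly increasing in k. As in
the text, the last segment problem is ignored: the loop simply stops when the
next segment no longer fits.

```python
def greedy_allocate(rates, dists, budget):
    """Return (truncation points, bits spent) for the greedy allocation."""
    n = [0] * len(rates)                       # Step 1: n_i = 0
    spent = 0
    while True:
        best = None                            # (slope, block, new point)
        for i, (R, D) in enumerate(zip(rates, dists)):
            for k in range(n[i] + 1, len(R)):  # Step 2: steepest slope
                slope = (D[n[i]] - D[k]) / (R[k] - R[n[i]])
                if best is None or slope > best[0]:
                    best = (slope, i, k)
        if best is None:                       # every block fully sent
            break
        _, i, k = best
        cost = rates[i][k] - rates[i][n[i]]
        if spent + cost > budget:              # last segment does not fit
            break
        spent += cost
        n[i] = k                               # Step 3: repeat
    return n, spent


rates = [[0, 10, 20], [0, 10, 20]]
dists = [[100, 40, 30], [50, 45, 20]]
print(greedy_allocate(rates, dists, budget=30))
```

Every iteration rescans all remaining truncation points of all code-blocks,
which illustrates why following the procedure literally is computationally
expensive.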



Following exactly the optimization procedure above is computationally complex.
The process can be sped up by first calculating a convex hull of the R-D slope
of each code-block i, as follows:

Step 1: Set S to the set of all truncation points.
Step 2: Set p to the first truncation point in S.
Step 3: Do until p is the last truncation point in S:

(i) Set k to the next truncation point after p in S.

(ii) Set $S_i^k = \dfrac{D_i^p - D_i^k}{R_i^k - R_i^p}$.

(iii) If p is not the first truncation point in S and $S_i^k > S_i^p$, remove
p from S and move p back one truncation point in S; otherwise, set p = k.

(iv) [End of current iteration. Restart at step 3(i), unless p is the last
truncation point in S.]
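The convex hull procedure can be sketched for a single code-block as follows.
The function name `convex_hull_points` and the list-based data layout are
assumptions for illustration: `R[k]` and `D[k]` are the rate and distortion at
truncation point k, index 0 is the starting point with nothing sent, and the
rates are assumed strictly increasing.

```python
def convex_hull_points(R, D):
    """Return (surviving truncation points, slope of each survivor)."""
    S = list(range(len(R)))       # Step 1: all truncation points
    slope = {}                    # slope[k]: slope of the segment ending at k
    pos = 0                       # Step 2: p = S[pos] is the first point
    while pos < len(S) - 1:       # Step 3: until p is the last point
        p, k = S[pos], S[pos + 1]                    # (i)
        slope[k] = (D[p] - D[k]) / (R[k] - R[p])     # (ii)
        if pos > 0 and slope[k] > slope[p]:          # (iii)
            del S[pos]            # p lies above the hull: remove it
            pos -= 1              # and move p back one point
        else:
            pos += 1              # otherwise p = k
    return S, {k: slope[k] for k in S[1:]}


R = [0, 10, 20, 30, 40]
D = [100, 80, 40, 35, 10]
print(convex_hull_points(R, D))
```

After the procedure, the slopes of the surviving truncation points decrease
monotonically, so a single threshold on the slope selects a truncation point
for the code-block.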

Once the R-D convex hull is calculated, the R-D optimization becomes simply
the search for a global R-D slope $\lambda$, where the truncation point of
each code-block is determined by

$$n_i = \arg\max_k \, (S_i^k > \lambda).$$

Putting the truncated bitstreams of all code-blocks together, we obtain a
compressed bitstream associated with each R-D slope $\lambda$. To reach a
desired coding bitrate B, we just search for the minimum $\lambda$ whose
associated bitstream satisfies the rate inequality (7-1). The R-D optimization
procedure is illustrated in Figure 24.




Figure 24. Bitstream assembler: for each R-D slope $\lambda$, a truncation
point can be found at each code-block. The slope $\lambda$ should be the
minimum slope for which the total rate allocated to all code-blocks does not
exceed the required coding rate B.
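The search for the global slope can be sketched as follows, assuming the
convex hull of every code-block is already available. The names
`rate_for_slope` and `find_global_slope` and the data layout are illustrative:
each hull is a list of (cumulative rate, slope) pairs for the surviving
truncation points, in order, with positive and decreasing slopes. Because the
total rate only changes when the threshold crosses one of the hull slopes, it
suffices to try those slopes (and zero) as candidates.

```python
def rate_for_slope(hull, lam):
    """Rate at the last truncation point whose slope exceeds lam (0 if none)."""
    rate = 0
    for r, s in hull:
        if s > lam:
            rate = r
    return rate


def find_global_slope(hulls, budget):
    """Smallest candidate slope whose total rate satisfies the budget."""
    candidates = sorted({s for hull in hulls for _, s in hull} | {0.0})
    for lam in candidates:        # ascending: first feasible one is minimal
        total = sum(rate_for_slope(hull, lam) for hull in hulls)
        if total <= budget:
            return lam, total
    return None                   # only reached for a negative budget


hulls = [
    [(20, 3.0), (40, 1.5)],       # code-block 0
    [(10, 2.0), (30, 0.5)],       # code-block 1
]
print(find_global_slope(hulls, budget=50))
```

A linear scan over the candidates is used here for clarity; since the total
rate is monotone in the threshold, a bisection over the sorted candidates
would work as well.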



To form a compressed image bitstream with the progressive quality improvement
property, so that we may gradually improve the quality of the received image
as more and more of the bitstream arrives, we may design a series of rate
points, $B^{(1)}, B^{(2)}, \ldots, B^{(n)}$. A sample rate point set is
0.0625, 0.125, 0.25, 0.5, 1.0 and 2.0 bpp (bits per pixel). For an image of
size 512 × 512, this corresponds to a compressed bitstream size of 2k, 4k, 8k,
16k, 32k and 64k bytes. First, the global R-D slope $\lambda^{(1)}$ for rate
point $B^{(1)}$ is calculated. The first truncation point of each code-block,
$n_i^{(1)}$, is thus derived. The resulting bitstream segments of the
code-blocks of one resolution level at one spatial location are grouped into a
packet. All packets that consist of the first bitstream segments form the
first layer, which represents the first quality increment of the entire image
at full resolution. Then, we may calculate the second global R-D slope
$\lambda^{(2)}$ corresponding to the rate point $B^{(2)}$. The second
truncation point of each code-block, $n_i^{(2)}$, can be derived, and the
bitstream segment between the first truncation point $n_i^{(1)}$ and the
second truncation point $n_i^{(2)}$ constitutes the second bitstream segment
of the code-block. We again assemble the bitstream segments of the code-blocks
into packets. All packets that consist of the second bitstream segments of the
code-blocks form the second layer of the compressed image. The process is
repeated until all n layers of the bitstream are formed. The resultant JPEG
2000 compressed bitstream is thus generated, and is illustrated in Figure 25.
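The correspondence between rate points and bitstream sizes is simple
arithmetic: pixels times bits per pixel, divided by 8 bits per byte. The
helper name `bitstream_bytes` is illustrative.

```python
def bitstream_bytes(width, height, bpp):
    """Compressed size in bytes for a given rate in bits per pixel."""
    return width * height * bpp / 8


rate_points = [0.0625, 0.125, 0.25, 0.5, 1.0, 2.0]
for bpp in rate_points:
    size = bitstream_bytes(512, 512, bpp)
    print(bpp, "bpp ->", int(size), "bytes =", int(size) // 1024, "k")
```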




Figure 25. JPEG 2000 bitstream syntax. SOC = start of image (codestream)
marker; SOT = start of tile marker; SOS = start of scan marker; EOI = end of
image marker.



8. The Performance of JPEG 2000 

Finally, we briefly demonstrate the compression performance of JPEG 2000. We
compare JPEG 2000 with the traditional JPEG standard. The test image is the
“Bike” standard image (gray, 2048 × 2560), shown in Figure 26. Three modes of
JPEG 2000 are tested, and are compared against two modes of JPEG. The JPEG
modes are progressive (P-DCT) and sequential (S-DCT), both with optimized
Huffman tables [4]. The JPEG 2000 modes are single layer with the
bi-orthogonal 9-7 wavelet (S-9,7), six-layer progressive with the
bi-orthogonal 9-7 wavelet (P6-9,7), and seven-layer progressive with the (3,5)
wavelet (P7-3,5). The JPEG 2000 progressive modes have been optimized for
0.0625, 0.125, 0.25, 0.5, 1.0, 2.0 bpp and lossless for the 5x3 wavelet. The
JPEG progressive mode uses a combination of spectral refinement and successive
approximation. We show the performance comparison in Figure 27.




Figure 26. Original “Bike” test image. 

Figure 27. Performance comparison: JPEG 2000 versus JPEG. From [1], courtesy
of the authors, Marcellin et al.

JPEG 2000 results are significantly better than JPEG results for all modes and
all bit-rates on this image. Typically JPEG 2000 provides only a few dB
improvement from 0.5 to 1.0 bpp but substantial improvement below 0.25 bpp and
above 1.5 bpp. Also, JPEG 2000 achieves scalability at almost no additional
cost. The progressive performance is almost as good as the single layer JPEG
2000 without the progressive capability. The slight difference is due solely
to the increased signaling cost for the additional layers (which changes the
packet headers). It is possible to provide “generic rate scalability” by using
upwards of fifty layers. In this case the “scallops” in the progressive curve
disappear, but the overhead may be slightly increased.
Abstract. This article presents a simple version of Integrated Sensing and
Processing (ISP) for statistical pattern recognition wherein the sensor
measurements to be taken are adaptively selected based on task-specific
metrics.
Thus the measurement space in which the pattern recognition task is ul- 
timately addressed integrates adaptive sensor technology with the specific 
task for which the sensor is employed. This end-to-end optimization of sen- 
sor/processor/exploitation subsystems is a theme of the DARPA Defense 
Sciences Office Applied and Computational Mathematics Program’s ISP 
program. We illustrate the idea with a pedagogical example and applica- 
tion to the HyMap hyperspectral sensor and the Tufts University “artificial 
nose” chemical sensor. 



1. Introduction 

An important activity, common to many fields of endeavor, is the act of refin- 
ing high order information (detections of events, classification of objects, identifi- 
cation of activities, etc.) from large volumes of diverse data which is increasingly 
available through modern means of measurement, communication, and process- 
ing. This exploitation function winnows the available data concerning an object 
or situation in order to extract useful and actionable information, quite often 
through the application of techniques from statistical pattern recognition to the 
data. This may involve activities like detection, identification, and classification 
which are applied to the raw measured data, or possibly to partially processed 
information derived from it. 

When new data are sought in order to obtain information about a specific 
situation, it is now increasingly common to have many different measurement 
degrees of freedom potentially available for the task. Some appreciation of the 
dimensionality of available data can be obtained by considering measurements
from one sensor, the hyperspectral camera, which is gaining broad application
in fields ranging from geological remote sensing to military target identification. 
This sensor produces an output comprised of hundreds of megapixel images of 
a scene, each image corresponding to the appearance of that scene in light from 
a narrow band of frequencies. Taken together, these images present a finely 
resolved spectrum for each pixel in the scene. The data sets are often presented 
as cubes and can have on the order of a billion voxels per scene. Of course for 
real scenes, the billions of degrees of freedom exhibit correlations; nevertheless, 
the raw data is presented in an overwhelmingly high dimensional space. 

This situation is magnified when one considers the diversity of sophisticated 
sensing mechanisms which might be applied to a given task. For example, re- 
mote sensing of terrain may be performed with natural light cameras, infrared 
cameras, hyperspectral imagers, fully polarimetric imaging radar, or combina- 
tions of all of these. This gives us many different views of the scene, but also 
presents a challenging requirement for effective processing and exploitation al- 
gorithms enabling reliable and affordable extraction of information from the 
high-dimensional spaces of sensed data. 

In many situations, constraints on the available time, bandwidth, human and 
machine resources, and on the prior relevant experience all significantly limit the 
ability to deal intelligently with the many potential sensing degrees of freedom. 
This is particularly the case in time-critical applications. In fact, one often 
finds that not all of the available sensor degrees of freedom are equally useful 
in a given situation, suggesting the need for a reasoned approach for choosing 
those particular measurement types to be made and/or communicated and/or 
processed. 

In this paper we show that it is sometimes possible to identify a particu- 
larly informative subspace of the space of all possible sensor measurements when 
it comes to the application of exploitation tasks on the sensed data. We will 
present examples in which performance is enhanced significantly by finding and 
working in the corresponding reduced-dimensionality subspace of sensed data. 
Even more, we will demonstrate in several cases that the determination of this 
particularly informative subspace then suggests the selection of a further sub- 
space of measurements to improve exploitation performance yet further. This is 
somewhat analogous to the game of “20 questions,” in which we progressively 
refine the scope and specificity of our questions based on partial understanding 
derived from previous attempts to narrow down the possibilities. 

This process of focusing and targeting measurements is in fact often realizable 
in practice, due in part to significant engineering advances made in adaptive 
“smart” sensor technology. Current and projected capabilities for modifying 
the way certain important sensors look at the world motivate the development 
of mathematical methodology for guiding the adaptive selection of the types of
measurements made by an adaptive sensor/processor subsystem with an eye to
enhancing and simplifying the exploitation of the resulting data. We present 
examples in which the way a sensor views a scene determines the abstract space
in which the exploitation is ultimately addressed. In these cases, a judicious 
choice of sensor viewpoint improves exploitation performance dramatically. 

Effective realization of the next generation of sensor/exploitation systems will 
require balanced integration and joint optimization of adaptive sensor front end 
functions with the pattern recognition tasks applied to sensor measurements 
in the system’s back end. Development of methodologies for end-to-end joint 
optimization of sensor/processor/exploitation subsystems with respect to
task-specific metrics is a key theme of the DARPA Applied and Computational
Mathematics Program’s “Integrated Sensing and Processing” (ISP) effort. Var- 
ious aspects of this program are currently being pursued by several groups of 
researchers in academia, industry, and government. Preliminary results suggest 
that certain applications in target detection and identification may derive signif- 
icant performance enhancements by applying this concept to take full advantage 
of adaptive sensor technology. 

In this paper, we illustrate one aspect of the ISP idea, in which the ex- 
ploitation subsystem is concerned with supervised statistical pattern recogni- 
tion (classification) and the observations take their value in a space with some 
linear ordering properties, such as multivariate time series. We illustrate the 
idea with a pedagogical example and application to the HyMap hyperspectral 
sensor (in which case the functional domain is spectral rather than temporal) 
and the Tufts University “artificial nose” chemical sensor. Other applications 
include gene expression analysis via DNA microarrays collected at multiple time 
instances, functional brain imaging collected at multiple time instances, etc. 

2. Statistical Pattern Recognition 

Pattern recognition starts with observations and returns class labels.
Statistical pattern recognition addresses the problem in a probabilistic
framework and applies statistical methods to it. Here we provide a brief
description of the basic setup of statistical pattern recognition. For
additional details, see, e.g.,
Fukunaga (1990), Devroye et al. (1996), Duda et al. (2000), Hastie et al. (2001), 
and references therein. 

Let the pair (X,Y) be distributed according to probability distribution F; 
(X, Y) ~ F. Intuitively, X represents measurements made on some phenomenon 
of interest and Y indicates higher order information about that phenomenon, 
such as its membership in one of several disjoint classes. 

More formally, the feature vector X is a $\Xi$-valued random variable. Usually
$\Xi = \mathbb{R}^d$ or some subset thereof. More generally, $\Xi$ may allow for
more elaborate data structures such as multivariate time series, images,
categorical data, dissimilarity data, etc. We will consider cases in which
feature observations are multivariate time series and spectral responses. For
categorical data $\Xi$ is simply a set (unordered). In some applications, $\Xi$
may consist of mixed data: some categorical, some continuous, and some time
series. For example, in a medical application one might have sex (categorical),
temperature (continuous), and an EKG (time series).

The class label Y is a {1, ..., J}-valued random variable, with J > 1 usually
finite. The label Y indicates the class to which the associated feature vector X
belongs. The prior probabilities of class membership are given by
$\pi_j := P[Y = j]$. We denote by $F_j$ the class-conditional distribution of
$X \mid Y = j$.

We partition statistical pattern recognition into two main categories: super- 
vised and unsupervised. The distinguishing feature between these two categories 
is that for supervised pattern recognition training data exist for which the class 
labels Y are observed, while this is not the case in the unsupervised case. We 
refer to the supervised case as classification and the unsupervised case as clus- 
tering. 

2.1. Classification. In the supervised case, training data are available. The
training data set is given by
$\mathcal{D}_n := \{(X_1, Y_1), \ldots, (X_n, Y_n)\}'$. That is, we have
available observations for which the true categorization is known. The goal is to
develop a classifier g which will take an unlabelled feature vector X, with true
but unobserved class label Y, and estimate its class label by $\hat{Y} = g(X)$.
We hope that $\hat{Y} = Y$ with high probability. Obviously, g should use the
available training data and will have functional dependence on the particular
observed training data set as well as on the measured features we are trying to
classify; thus

$$g : \Xi \times (\Xi \times \{1, \ldots, J\})^n \to \{1, \ldots, J\}.$$

The use of training data to build the classifier is referred to as training. 

In order for statistical pattern recognition methodologies to have any guarantee
of success, we must assume that the training data are representative. Usually
this means that $(X_i, Y_i) \stackrel{\mathrm{iid}}{\sim} F$. Alternatively,
writing $I\{E\}$ as the indicator function for event E, the class-conditional
sample sizes given by $N_j(\mathcal{D}_n) := \sum_{i=1}^{n} I\{Y_i = j\}$ may be
design variables rather than random variables, in which case the conditional
random variables $X_i \mid Y_i = j$ are independent and identically distributed
(iid) according to the class-conditional distributions $F_j$. In the former case
the class-conditional sample sizes $N_j(\mathcal{D}_n)$ yield consistent
estimates of the priors: $\hat{\pi}_j(\mathcal{D}_n) := N_j(\mathcal{D}_n)/n \to \pi_j$
almost surely as $n \to \infty$. In the latter case a priori knowledge of the
prior probabilities must be assumed.

Given a training data set $\mathcal{D}_n$, the probability of misclassification
for classifier g is given by

$$L(g \mid \mathcal{D}_n) := P[g(X; \mathcal{D}_n) \neq Y \mid \mathcal{D}_n].$$

The Bayes optimal probability of misclassification is given by

$$L^* = \min_g P[g(X) \neq Y];$$

notice that for the purposes of defining this bound, we consider classifiers which
are not constrained by a particular training set. A Bayes rule is any map $g^*$
with $L(g^*) = L^*$. The Bayes rule can be obtained from the class-conditional
distributions $F_j$ and the prior probabilities $\pi_j$ as

$$g^*(x) = \arg\max_j \pi_j \, dF_j(x).$$

Notice that g* depends on the distribution of (X,Y), but not on the training 
data set. 
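
As a minimal sketch of the Bayes rule, assume that the class-conditional
distributions have known densities. The two univariate normal classes, the equal
priors, and the function names below are hypothetical choices made only for
illustration; they do not come from the experiments in this paper.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def bayes_rule(x, priors, densities):
    """Return the label j in {1, ..., J} maximizing priors[j-1] * densities[j-1](x)."""
    scores = [p * f(x) for p, f in zip(priors, densities)]
    return 1 + max(range(len(scores)), key=scores.__getitem__)

# Hypothetical two-class problem: F_1 = N(-1, 1), F_2 = N(+1, 1), equal priors.
priors = [0.5, 0.5]
densities = [lambda x: normal_pdf(x, -1.0, 1.0),
             lambda x: normal_pdf(x, +1.0, 1.0)]

# In this symmetric case the decision boundary is x = 0, so the Bayes error
# equals P[Z > 1] for a standard normal Z.
bayes_error = 0.5 * math.erfc(1.0 / math.sqrt(2.0))
```

No training data enter this computation: the rule is determined by the priors
and the densities alone, which is why it serves only as a benchmark.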

The goal of classification, then, is to devise a methodology for taking training
data $\mathcal{D}_n$ and constructing a classifier g such that
$L(g \mid \mathcal{D}_n)$ is as close to $L^*$ as possible. In particular, we
desire consistency: $L(g \mid \mathcal{D}_n) \to L^*$ as $n \to \infty$ (in
probability or with probability one).

2.2. The curse of dimensionality. A common misconception in statistical 
pattern recognition is that “more is better”. It is intuitively obvious — and 
wrong — that if ten features per observation are good then a hundred features 
are even better. This is a result of one manifestation of the so-called curse of 
dimensionality (Bellman (1961), Scott (1992)). 

The curse has several manifestations. Silverman (1986) considers probability 
density function estimation, and provides a table for the number of observations 
needed to obtain a point estimate with a given accuracy as the dimension in- 
creases. The estimator considered is a nonparametric one, the kernel estimator. 
It is shown that the number of observations required grows from 4 for univariate 
data to over 800,000 for ten-dimensional data. Thus, to achieve a given accu- 
racy for a kernel estimator at a single point, the required number of observations 
grows exponentially in the dimension. 

Another consequence of the curse of dimensionality is discussed in Scott
(1992), where he points out statistical ramifications of the fact that the
volume of a cube in high dimensions resides primarily in the corners, while the
volume of a sphere resides mostly near the boundary. This is shown by comparing
the volume of a sphere with radius r to that of an interior sphere of radius
$r - \epsilon$, and noting that for arbitrarily small $\epsilon > 0$ the ratio of
the interior volume to the full volume goes to 0 as dimensionality goes to
infinity, indicating that essentially none of the volume resides in the interior
sphere. That is, “high-dimensional space is mostly empty”, which in turn suggests
that the required sample size for fixed performance grows (rapidly) with
dimension. (See also Silverman (1986), Table 4.2.)
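
The volume comparison can be checked numerically. Since the volume of a
d-dimensional ball scales as the d-th power of its radius, the fraction of the
volume lying in the interior ball is $((r - \epsilon)/r)^d$. The radius and
shell thickness used below are arbitrary illustrative values.

```python
def interior_volume_fraction(r, eps, d):
    """Fraction of a d-dimensional ball of radius r lying within radius r - eps."""
    return ((r - eps) / r) ** d

# Illustrative values only: unit radius and a shell of thickness 0.05.
for d in (1, 10, 100, 1000):
    print(d, interior_volume_fraction(1.0, 0.05, d))
```

Even with a shell that is only five percent of the radius, the interior
fraction is negligible by d = 1000.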

Jain et al. (2000) discuss another aspect of the curse, first described by
Trunk (1979). It is shown that in the simple case of two d-dimensional
multivariate normals with equal (known) identity covariances, known priors
$\pi_j = 1/2$, and means

$$\mu_j = (-1)^j \left[1, \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{3}}, \ldots, \tfrac{1}{\sqrt{d}}\right]'$$

for classes j = 1, 2, the probability of error for the linear classifier (the
classifier which labels an observation as belonging to the class associated with
the nearest of the two class-conditional sample means) goes to 0 as
$d \to \infty$ if the means are known, but this probability of error converges to
1/2 if the means must be estimated from any training sample of (arbitrarily
large but) fixed size. In other words, adding variates that each decrease the
Bayes error can actually increase the classification error when estimates must
be used rather than the (unknown) truth.
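
The two regimes of this example can be explored with a short calculation. The
known-means error below is exact for identity covariance: it is the standard
normal upper tail at the norm of the mean vector. The estimated-means
expression is the normal approximation usually quoted for Trunk's example; it
is not derived in the text above, so it should be read as an assumption, as
should the interpretation of n as the number of training observations used to
estimate the mean.

```python
import math

def normal_upper_tail(t):
    """P[Z > t] for a standard normal Z."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def harmonic_sum(d):
    """Squared norm of the mean vector: sum of 1/i for i = 1, ..., d."""
    return sum(1.0 / i for i in range(1, d + 1))

def error_known_means(d):
    """Error of the nearest-mean rule when the true means are used."""
    return normal_upper_tail(math.sqrt(harmonic_sum(d)))

def error_estimated_means(d, n):
    """Approximate error when the mean is estimated from n observations (assumed form)."""
    h = harmonic_sum(d)
    theta = h / math.sqrt((1.0 + 1.0 / n) * h + d / n)
    return normal_upper_tail(theta)

# Illustrative comparison with a fixed (hypothetical) training size n = 100.
for d in (10, 100, 10**5):
    print(d, error_known_means(d), error_estimated_means(d, 100))
```

Under these assumptions the known-means error keeps falling as d grows, while
the estimated-means error first falls and then climbs back toward 1/2.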



2.3. Classifiers. Assume for simplicity that the class-conditional probability
density functions $f_j$ exist. Then any density estimator $\hat{f}_j$ yields a
plug-in classification rule:

$$g(x) = \arg\max_j \hat{\pi}_j(\mathcal{D}_n) \, \hat{f}_j(x; \mathcal{D}_n).$$

For iid training data the class-conditional sample proportions $\hat{\pi}_j$ are
consistent estimators for the priors; if in addition a density estimator is
employed for which $\hat{f}_j \to f_j$ in $L_1$ or $L_2$ a.s., for instance, then
$L(g \mid \mathcal{D}_n) \to L^*$ a.s.
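
A plug-in rule of this form might be sketched as follows, assuming scalar
features and a Gaussian kernel density estimator with a fixed bandwidth h. The
kernel, the bandwidth, and the toy training pairs are illustrative assumptions,
not choices made in this paper.

```python
import math

def kernel_density(x, sample, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    norm = len(sample) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample) / norm

def plug_in_classifier(train, h):
    """Build g(x) = argmax_j pi_hat_j * f_hat_j(x) from training pairs (x_i, y_i)."""
    by_class = {}
    for x, y in train:
        by_class.setdefault(y, []).append(x)
    n = len(train)

    def g(x):
        # Score each class by estimated prior times estimated density.
        return max(by_class,
                   key=lambda j: len(by_class[j]) / n * kernel_density(x, by_class[j], h))
    return g

# Hypothetical training data: class 1 near -1, class 2 near +1.
train = [(-1.2, 1), (-0.8, 1), (-1.0, 1), (0.9, 2), (1.1, 2), (1.4, 2)]
g = plug_in_classifier(train, h=0.5)
```

The estimated priors are the class proportions in the training set, which is
appropriate only when the training data are iid draws from F.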

Density estimation comes in two basic flavors, parametric and nonparamet- 
ric. (We categorize “semiparametric” with nonparametric for the purposes of 
this discussion.) Parametric density estimation assumes that a parameterized 
functional form for the class-conditional densities fj is known and focuses on es- 
timating the (few) unknown parameters. Nonparametric methods, on the other 
hand, make no such parametric assumption. Parametric density estimation is 
an easier problem — rates of convergence are faster, for example — due to the 
fact that the target is finite dimensional. Of course, if the assumed parametric 
form is not correct, a parametric approach will not in general yield consistent 
classification. Nonparametric methods provide a more general guarantee of con- 
sistency, at a price of reduced efficiency if indeed a simple parametric form is 
appropriate. Classical examples of these two categories, which allow for a fruitful 
“compare and contrast” exercise, are given by finite mixture models (McLachlan
and Krishnan (1997)) versus kernel estimators (Silverman (1986)). 

Density estimation is, however, quite expensive in high dimensions (curse of
dimensionality). Thus, for multivariate feature vectors in particular, there is
much interest in developing applicable classification methodologies which
somehow reduce this cost. One approach involves preprocessing to yield reduced
dimensionality without seriously degrading classification performance. Thus, one
might choose a projection $\mathbb{P} : \Xi \to \mathbb{R}^{d'}$, where $d' = 1$
or 2, say, and consider classification, as above, using
$[(\mathbb{P}(X_1), Y_1), \ldots, (\mathbb{P}(X_n), Y_n)]'$ as the transformed
training data. See, for instance, principal component analysis, independent component
analysis, linear discriminant analysis, and projection pursuit. These techniques 
can be found in standard multivariate statistics texts such as Seber (1984), Mar- 
dia et al. (1995), Johnson and Wichern (1998), and in pattern recognition texts 
such as Fukunaga (1990), Duda et al. (2000), and Hastie et al. (2001). 

Consideration of the maxim “classification is easier than density estimation”
suggests that instead of trying to estimate the probability densities, one might 
choose to estimate the decision region directly. This, too, can be done paramet- 
rically or nonparametrically. 

The simplest decision region is a linear one, and several methods involve either 
estimating the best linear separator of the data or extending to piecewise linear 
discriminators. See for example Sklansky and Wassel (1979). 

A popular nonparametric method is the nearest neighbor classifier (and its
extension, the k-nearest neighbor classifier). The idea is simple, yet powerful:
choose the category associated with the nearest element of the training set. Given
a training set $\mathcal{D}_n = \{(X_1, Y_1), \ldots, (X_n, Y_n)\}'$, the nearest
neighbor classifier $g_{nn}$ is defined to be

$$g_{nn}(x; \mathcal{D}_n) = Y_{\arg\min_i \rho(x, X_i)},$$

where $\rho : \Xi \times \Xi \to [0, \infty)$ is a distance function. This
classifier has been studied widely (“simple rules survive!”) and is a standard
against which new classifiers are often tested.

It is well known that the nearest neighbor rule has asymptotic error bounded
above by $2L^*$. This means that if the classes are strictly separable, so that
$L^* = 0$, then the nearest neighbor classifier is consistent.

The k-nearest neighbor classifier is an obvious extension. Rather than
considering only the nearest observation, consider the k nearest elements of the
training set. A simple vote is taken amongst the classes. (More complicated
voting schemes have been investigated.)
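
The nearest neighbor rule and its k-nearest extension can be sketched in a few
lines. The absolute-difference distance on scalar features and the toy training
set are assumptions made for illustration; any distance function on the feature
space could be passed as `rho`.

```python
from collections import Counter

def knn_classify(x, train, k=1, rho=lambda a, b: abs(a - b)):
    """Label x by a simple vote among its k nearest training observations.

    train is a list of (feature, label) pairs. Vote ties go to the label
    encountered first when neighbors are ordered from nearest to farthest.
    """
    neighbors = sorted(train, key=lambda pair: rho(x, pair[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical training set with two classes on the real line.
train = [(0.0, 1), (0.2, 1), (0.9, 2), (1.0, 2), (1.1, 2)]
label_1nn = knn_classify(0.6, train, k=1)
label_3nn = knn_classify(0.1, train, k=3)
```

With k = 1 this reduces to the nearest neighbor classifier. Sorting the whole
training set is the simplest approach, not the most efficient one.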

Denoting the k-nearest neighbor classifier by $g_k$, the following theorem of
Stone (1977) establishes the universal consistency of this classifier.

Theorem. Given iid training data $\mathcal{D}_n$, if $k \to \infty$ and
$k/n \to 0$ then

$$E\,L(g_k \mid \mathcal{D}_n) \to L^*$$

for all distributions.

Many other classifiers have been, and continue to be, developed. We argue, 
however, that for high-dimensional problems the choice of classifiers is not the
most pressing problem. Rather, dimensionality reduction is the fundamental 
determining aspect of classification performance in high dimensions. 

2.4. Misclassification rate estimation. In order to assess how good a classi- 
fier is, or to compare classifiers, we would like to know the misclassification rate 
(probability of misclassification) L. Unfortunately, knowing the exact value of L 
requires knowledge of the (unknown) class-conditional distributions. Therefore, 
an important issue in pattern recognition is the estimation of the misclassification 
rate. 

One method for misclassification rate estimation is called the training/test set
method: one selects a training set from which to build the classifier, and holds
out an independent test set (for which the class labels are also known) upon
which to evaluate the classifier. This unbiased holdout estimate of classification
performance is denoted $L_n^m$, where n observations are used in training and m
observations are used in testing. Analysis is easy: $m L_n^m$ is the sum of
independent Bernoulli random variables, and hence follows a
Binomial$(m, L(g \mid \mathcal{D}_n))$ distribution. A problem with this approach
is that it requires the collection of additional labelled data beyond that which
is used to build the classifier. Labelled data can be expensive, and one might
want to use all the available labelled data for training, under the assumption
that this will yield a better classifier.
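
Because the number of test errors is binomial, the holdout estimate comes with
a simple standard error. The sketch below computes both; the threshold
classifier and the ten labelled test points are hypothetical and serve only to
make the example runnable.

```python
import math

def holdout_estimate(classifier, test_set):
    """Fraction of labelled test pairs (x, y) that the classifier gets wrong."""
    errors = sum(1 for x, y in test_set if classifier(x) != y)
    return errors / len(test_set)

def binomial_standard_error(l_hat, m):
    """Plug-in standard error of a holdout estimate based on m test points."""
    return math.sqrt(l_hat * (1.0 - l_hat) / m)

# Hypothetical classifier and test set, for illustration only.
classifier = lambda x: 1 if x < 0 else 2
test_set = [(-2.0, 1), (-0.5, 1), (0.3, 1), (0.7, 2), (1.5, 2),
            (-0.1, 2), (2.2, 2), (-1.1, 1), (0.4, 2), (-0.9, 1)]

l_hat = holdout_estimate(classifier, test_set)
se = binomial_standard_error(l_hat, len(test_set))
```

The standard error shrinks only as the square root of m, which is one reason
small test sets give imprecise comparisons between classifiers.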

The method in which one uses all the labelled data to build the classifier and
then uses the same data to test the classifier is called resubstitution, denoted
$L^{(R)}$. The resubstitution error rate can sometimes be useful in the analysis of
classifiers, but obviously yields a biased (optimistic) estimate of the error.

An improvement on the resubstitution method, with some of the flavor of the
training/test method, is leave-m-out cross-validation. In this, m observations
are withheld from a training set of size n and are subsequently used to test the
resultant classifier. This is repeated with the next m observations, until all
observations have been in a test set (each observation is used in only one
test set). If m = 1, this is simply referred to as cross-validation. For a discussion
of the relative merits of various methods for estimating misclassification rate, see
Devroye et al. (1996) or Ripley (1996).
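
A minimal sketch of leave-m-out cross-validation follows, assuming the data are
held in a list of (feature, label) pairs and that a user-supplied training
function returns a classifier. The nearest neighbor stand-in and the six-point
data set are hypothetical.

```python
def leave_m_out_error(data, m, train_fn):
    """Average test error over consecutive disjoint folds of (up to) m observations.

    train_fn(training_pairs) must return a classifier g with g(x) -> label.
    """
    errors = 0
    for start in range(0, len(data), m):
        held_out = data[start:start + m]
        training = data[:start] + data[start + m:]
        g = train_fn(training)
        errors += sum(1 for x, y in held_out if g(x) != y)
    return errors / len(data)

def train_nearest_neighbor(training):
    """Hypothetical stand-in classifier: 1-nearest neighbor on scalar features."""
    return lambda x: min(training, key=lambda pair: abs(x - pair[0]))[1]

# Hypothetical data, deliberately ordered by class.
data = [(0.0, 1), (0.2, 1), (0.4, 1), (1.0, 2), (1.2, 2), (1.4, 2)]
error_m1 = leave_m_out_error(data, 1, train_nearest_neighbor)
error_m3 = leave_m_out_error(data, 3, train_nearest_neighbor)
```

In this toy example m = 1 gives zero error, while m = 3 withholds an entire
class in each fold and every held-out point is misclassified. When the data are
ordered by class, as here, shuffling before forming the folds avoids this.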

2.5. Clustering. In the unsupervised case, we have available to us feature
vectors $\mathcal{X}_n := \{X_1, \ldots, X_n\}'$, with no class labels available.
The goal is to cluster these data in such a way as to provide clusters
$C_k \subset \mathcal{X}_n$, $k = 1, \ldots, K$, which correspond to some
(interesting? useful?) unobserved class labels. Clustering is obviously a more
difficult problem than classification. However, clustering is a likely candidate
for the exploitation subsystem in some ISP applications.

Clustering can be viewed as the discovery of latent classes within the data. 
The clusters correspond to classes that were not identified by the collector of the 
data. These can represent, for example, different variants of a disease in a medi- 
cal application, previously unidentified subspecies in a biological application, or 
different types of vehicle in an image processing application. 

Unlike classification, clustering per se is not well posed. Before proceeding,
one must adopt (implicitly or explicitly) a definition of cluster. Different
definitions lead to different clusterings, and without a priori information, there
is little reason to select one clustering over another. Thus, clustering depends 
fundamentally on the underlying cluster model. 

A further distinction is that clustering requires a determination of the number 
of clusters. This can be done a priori, but usually it is done interactively, either 
through presentation of potential classes to the user, or through some testing 
procedure on the model. Thus, clustering combines all of the hard questions in 
statistics: model selection, model building and model assessment. 




INTEGRATED SENSING/PROCESSING FOR PATTERN RECOGNITION 231 



3. Integrated Sensing and Processing 

The smooth functioning of industry, the government, and even our individ- 
ual day-to-day activities increasingly relies on a broad spectrum of sensing sys- 
tems keeping a vigilant eye (ears, nose, etc.) on myriad complex environments 
and tasks. We are becoming accustomed to the benefits of sophisticated sens- 
ing/exploitation systems, ranging from the CT scanners and magnetic resonance 
imagers that our doctors may inflict upon us, all the way to the suite of radars, 
thermal imagers, accelerometers, GPS, and chemical sensors which some modern
cars carry. (Progress.) Moreover, vast quantities of sophisticated sensor data
are readily obtained for perusal in the comfort of one’s home: large quantities of
imagery from webcams, surveillance cameras, hyperspectral sensors, synthetic 
aperture radars (SAR), and X-ray astronomical data, to name only a few types, 
can all be quickly accessed on the internet. 

The growing complexity and volume of digitized sensor measurements, the 
requirements for their sophisticated real time exploitation, the limitations of hu- 
man attention, and increasing reliance on automated adaptive systems all drive 
a trend towards heavily automated computational processing of the flood of raw 
sensor data in order to refine out essential information and permit effective ex- 
ploitation. Complex computational tasks like image formation and enhancement, 
feature extraction, target detection, classification, intelligent compression, index- 
ing, and operator cueing contribute substantially to the successful operation of 
the ubiquitous sensing systems essential for our modern technological society. 

A generic sensor system may be viewed as a machine for converting informa- 
tion about an object or situation through various representations. The infor- 
mation is initially carried in physical fields (for example, light waves entering 
a camera lens), transduced into a digital representation (such as the pixels of 
a grayscale image), which may be computationally manipulated (contrast en- 
hanced for example), and, in many cases, converted to concentrated symbolic 
information (such as the identification of a particular person standing before the 
camera). A cartoon model of the generic sensor system is depicted in Figure 1 
with the feedforward flow of information from stage to stage indicated by the 
horizontal arrows. Each subsystem in the figure performs its specific transforma- 
tion of information in its turn, from physical fields to digital representation in the 
physical layer, with digital manipulations and enhancements in pre-processing, 
and finally exploitation to extract high level content. Digital processing generally 
begins on a pixel array “thrown over the fence” from the physical layer. There 
is generally little direct feedback from the processing layers to the physical layer 
that would enable a rapid adaptation of that subsystem’s behavior on the basis 
of discoveries or requirements of processing layers. In consequence, the physi- 
cal layer typically measures a rather fixed representation of the physical fields, 
and the digital processor endeavors to extract useful information out of this by 
computational processing. 

Over the last 40 years the need for effective computational processing and
exploitation of digitized sensor data has been met by advances in algorithms 
from Digital Signal Processing (DSP) and statistical pattern recognition. These 
advances have combined the power of applied mathematics with the growing 
precision, stability, throughput, and easy availability of digital processors in an 
attempt to meet the growing challenges posed by modern applications. One 
big impact of these advances on sensor systems is the decoupling into the sub- 
systems described previously: physical sensor layer, digital processor layer, dig- 
ital/symbolic exploitation layer. This represents a significant transformation 
of sensor/exploitation systems from those of previous times, when exploitation 
tasks were not automated, and only rudimentary signal processing was performed 
directly on sensor measurements in the analog domain. Within the current di- 
vision of labor, analog manipulation is limited to the first stages of the physical 
sensing, whereas recent computational mathematical developments in DSP and 
pattern recognition naturally concern the digital processing and exploitation lay- 
ers almost exclusively. 

Recent DARPA sponsored reviews of trends in sensor systems have suggested 
that the growth of computational complexity in sensor systems networks is 
quickly becoming a hard limit to scale-up through the concomitant growth of 
costs of hardware and software, power consumption, and specialization. As
sensor data volume and dimensionality grow, computational loads appear to be
outstripping the steady Moore’s law growth of processor power and the sporadic
algorithmic breakthroughs in throughput. One response to this is DARPA’s
Integrated Sensing and Processing (ISP) program, which attempts to meet this
challenge by leveraging mathematical advances across all components of a
sensing system. ISP seeks examples of sensing systems for which it is possible and
advantageous to jointly optimize the traditionally decoupled subsystems of a
sensor system. This contrasts sharply with standard approaches which indepen- 
dently optimize subsystems such as the physical layer (sensor head), and the 
various computational processing layers. 

ISP begins with the observation that the main impact of mathematical de- 
velopments for sensor systems in recent times has been in the processing and 
exploitation layers, where the ability to computationally adapt mathematical
representations and transformations of digital data in real time enables the
discovery and exploitation of structure hidden in raw sensor output. Similar but
largely untapped opportunities now exist in a current generation of digitally 
controllable sensor heads for a broad spectrum of phenomena, suggesting new 
capability to adaptively sense features more informative than pixels. 

To realize this capability will require effective mathematical optimizations and 
control strategies which intelligently integrate currently disjoint tasks of sensing 
and computation. This promises immediate benefit of “load balancing” between 
sensor head and processing, with lower signal processing burden while greatly 
improving the quality and information concentration of the measurements.
Carrying on with this idea, ISP contemplates “back end” functions such as classifier
algorithms playing an active role in dynamic control of their sensor inputs; in 
effect playing a mathematically optimal game of “20 questions” through tailored 
sensor queries suited to the task at hand and what is known or suspected up to 
the present time. In the new picture of a sensor system, the components have 
overlapping functionality and communicate data and control in an all-to-all load 
balanced network. 

In this paper, we demonstrate several simple “proof-of-concept” examples of 
ISP, in which the exploitation subsystem feeds back to the sensor information 
on what next to sense, based on the determination of the exploitation (classifier) 
on the current data. Thus, based on preliminary classification of what has been 
observed, the sensor changes what it is collecting and how it is processing the 
observations. Again we refer to the cartoon presented in Figure 1. Traditionally, 
a sensor collects measurements which are processed in some manner and fed to 
a classifier. The classifier renders its decision and some action is taken based 
on this decision. This traditional flow is indicated by the horizontal arrows. 
In adaptive sensors a sensor-preprocessor feedback loop may be present. In
the full ISP scenario, the classifier also modifies the set of measurements to be 
sensed based on exploitation-level feedback. Thus, based on analysis done in the 
different subsystems, sensor adjustments are fed back to the sensor to improve 
the overall performance of the system without adversely impacting the overall 
throughput. 




Figure 1. Integrated Sensing and Processing (ISP). The initial sensor
measurements are processed in the preprocessor. This may indicate adjustments to the
sensor (top arrow) — for example, to improve signal to noise ratio. Preliminary 
classification results at the exploitation stage suggest changes to the sensing, 
which information is also fed back to the sensor (bottom arrow). 

One analogy for the ISP is a human doctor, viewed as an adaptive sen- 
sor/exploitation system. The doctor collects preliminary information, tempera- 
ture, blood pressure, etc. Then, based on these measurements and external in- 
formation (for example, information about the outbreak of a plague), the doctor 
selects new measurements to collect in order to improve or confirm the prelim- 
inary diagnosis. This can be viewed as adjusting the sensor to collect different 
or more precise information, based on a preliminary classification from the ex- 
ploitation subsystem. Similarly, a hyperspectral sensor might adjust the spectral 
range of the sensor based on preliminary indications from the classifier of the 
potential class of the observed object. 

Figure 2. Illustration of a hyperspectral data cube. The cube consists of spatial
images (bands) taken at different wavelengths $\lambda$.

The ISP approach will be illustrated in the following sections with a ped- 
agogical example and two experimental applications. These illustrations will 
demonstrate that for some simple but perhaps realistic situations the ISP idea 
of utilizing information obtained in the classification subsystem to drive sensor 
parameters can improve the overall performance. 

4. Experiment: Hyperspectral Data Cube 

For this experiment we have obtained from Naval Space Command a HyMap 
hyperspectral data set — imagery of the airport at Dahlgren, Virginia (Figure 
2). The data consist of 126 images, each one representing the appearance of the 
scene in light which lies in a narrow spectral band. These bands are obtained 
throughout the visible, near infrared, and short wave infrared range. Equiva- 
lently, we can think of the data as a collection of spectra indexed by the spatial 
locations in the scene. Spectral imagery data of this sort can provide information 
about the spatial structure and chemical makeup of the objects within the scene 
of regard, and is being exploited for problems of detection and identification in 
a diversity of settings, ranging from biomedicine to defense. 

Hyperspectral data gives very fine spectral resolution, but this is not always 
an advantage. Obviously hyperspectral data is very high-dimensional compared 
to multispectral imagery, which is similar in concept but comprises fewer, coarser
spectral bands. One must be concerned with the curse of dimensionality in the 
statistical pattern recognition tasks applied to hyperspectral data. Moreover, 
the large data sets produced by hyperspectral imagers can also lead to signifi- 
cant computational and communication challenges, particularly for time-critical 
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applications. Furthermore, the narrow spectral range of the hyperspectral bands 
mean that one must collect light for some time before obtaining enough photons 
in a given band to produce an image with reasonable signal-to-noise ratio. A 
multispectral sensor with fewer bands would offer coarser spectral resolution but 
could offer better time resolution, lower dimensional data, and less overall data 
burden than a hyperspectral sensor. A multispectral sensor with tunable bands 
could potentially offer some of the benefits of both worlds. 

To explore this possibility, we used the more than 100 bands of the HyMap
hyperspectral data set as the basis for simulation of a two-band ISP sensor
system in which the two bands are chosen adaptively. For the purposes of this
experiment, 6 bands with high noise were removed and the remaining 120 bands
are used to give an indication of the distribution of photons over wavelength.
The coarse bands of the ISP sensor are each the result of a Gaussian filter
applied to the 120-band HyMap spectrum. That is, for each spatial location and
each filter $i = 1, 2$, a weighted sum of the spectral intensities is returned,
with weights given by the amplitude of a Gaussian with mean $\mu_i$ and
standard deviation $\sigma_i$. Thus the sensor has four adjustable parameters:
the spectral means and standard deviations of the two Gaussian filters.
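
The simulated sensor just described can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the wavelength grid, the filter settings, and the random stand-in spectra are all assumptions, and the Gaussian weights are left unnormalized because the text does not say whether they were normalized.

```python
import numpy as np


def gaussian_band(spectra, wavelengths, mean, std):
    """Collapse fine spectral bands into one coarse band with Gaussian weights.

    spectra:     array (n_pixels, n_bands), intensity in each fine band
    wavelengths: array (n_bands,), centre wavelength of each fine band
    mean, std:   centre and width of the Gaussian filter (units of wavelengths)
    """
    weights = np.exp(-0.5 * ((wavelengths - mean) / std) ** 2)
    return spectra @ weights  # weighted sum over the spectral axis


def two_band_features(spectra, wavelengths, params):
    """params = (mean1, std1, mean2, std2): the four adjustable parameters."""
    m1, s1, m2, s2 = params
    return np.column_stack([
        gaussian_band(spectra, wavelengths, m1, s1),
        gaussian_band(spectra, wavelengths, m2, s2),
    ])


# Hypothetical stand-in data: 5 pixels, 120 fine bands on an assumed grid.
rng = np.random.default_rng(0)
wavelengths = np.linspace(0.45, 2.5, 120)  # assumed grid, not from the source
spectra = rng.random((5, 120))
features = two_band_features(spectra, wavelengths, (0.8, 0.1, 1.6, 0.2))
```

Each pixel is reduced from 120 numbers to 2, and changing the four parameters changes which part of the spectrum those two numbers summarize.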

Pixels were selected from the image and classed as corresponding to one of 7 
classes, using ground truth based on a visit to the site. The 7 classes are: runway, 
pine, oak, grass, water, brush, swamp. A training set of 700 observations (100 
from each class, selected randomly) was chosen, and the remaining (14,048) 
observations were designated a test set. 

The experiment simulates an adaptable sensor which operates as follows. Ini- 
tially the sensor collects information about the scene in two pre-specified bands 
(the factory setting), simulated by applying the two Gaussian windows to the 
HyMap data with fixed initial filter parameter settings. A classifier examines 
the two band data for each pixel and indicates its coarse classification in the 
form of the most likely (at most three) classes to which it may belong. Given 
the classes that this first classifier identifies as contenders, the sensor adjusts its 
filter parameters to collect new two band data optimized for the task of refining 
the initial classification by discriminating among the short list of candidates se- 
lected in round one. See Figure 3. Thus, the overall sensing and classification 
takes place in multiple stages with feedback to the sensor to improve the results. 
The classifiers must be trained and optimized; therefore for all stages, the train- 
ing data has been split into two equal subsets, with one set used in classifier 
construction and the other used to estimate the performance of the classifier. 
More precisely: 

Stage 1. We employ a 7-nearest neighbor classifier as the initial coarse-grained
classifier. For each observation presented to it, the labels of the top three
most likely classes (of the seven defined above) are returned. The filter
parameters defining the two bands of the sensor are selected so as to maximize
the empirical probability that this classifier places the correct class amongst
the top three. These parameters, along with the 7-nearest neighbor classifier
defined by the full training set, constitute the initial sensor/classification
system. This provides the “factory setting” of the system.

Stage 2. For each of the $\binom{7}{3} = 35$ “superclasses” (combinations of 3
candidate classes), filter parameters are selected which optimize the
classification of an observation drawn from this superclass, narrowing down its
classification to just one of these 3 candidates. That is, we maximize the
probability that an observation is assigned to the correct class given the data
available for the 3-class “superclass” identified for that observation in
stage 1. The classifier applied to the sensor features tuned to a given
superclass is a 1-nearest neighbor classifier based on the training data
restricted to the 3 candidate classes of that superclass.

Again, performance is evaluated using the split training set, not the indepen- 
dent test set. The filter parameters selected for each combination of classes will 
be used to tune the sensor for the best possible discrimination when initial clas- 
sification of a test observation indicates that particular combination of classes 
constitutes the candidate set. 

Stage 3. The overall classifier is tested as follows. For each observation in 
the test set, the initial “factory setting” filter parameters are used to obtain the 
initial two sensor features. The 7-nearest neighbor classifier is evaluated on these 
initial features. Generally this will return the three leading candidate classes for 
the observation. In the event that all 7 nearest neighbors are labelled with the 
same class, unanimity is viewed as decisive and the test observation is classified 
accordingly without further ado. Otherwise, the filter parameters appropriate 
to the candidate set of classes are used to adapt the sensor and produce a new 
feature vector. This new feature vector is passed to the appropriate nearest 
neighbor classifier, which renders its decision. 
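
The test-time flow of Stage 3 can be sketched as below. This is an illustrative reading of the procedure, not the authors' implementation: `select_bands` is a toy stand-in for the Gaussian-filter sensor, the `tuned` lookup table and the fallback to the factory setting when a candidate set has no tuned entry are assumptions, and the data are invented.

```python
import numpy as np
from collections import Counter


def top_candidates(x, train_X, train_y, k=7, n_keep=3):
    """Classes voted for by the k nearest neighbours (at most n_keep of them)."""
    dist = np.linalg.norm(train_X - x, axis=1)
    votes = Counter(train_y[np.argsort(dist)[:k]].tolist())
    return tuple(sorted(label for label, _ in votes.most_common(n_keep)))


def one_nn(x, train_X, train_y):
    """Label of the single nearest training point."""
    return train_y[np.argmin(np.linalg.norm(train_X - x, axis=1))]


def classify_with_feedback(sense, factory, tuned, train_spectra, train_y, spectrum):
    """sense(spectra, params) -> features; tuned maps candidate sets to params."""
    # Stage 1: coarse classification at the factory setting.
    x0 = sense(spectrum[None, :], factory)[0]
    cands = top_candidates(x0, sense(train_spectra, factory), train_y)
    if len(cands) == 1:  # all 7 neighbours agree: accept without feedback
        return cands[0]
    # Feedback: retune the sensor for this candidate set, then re-sense.
    params = tuned.get(cands, factory)  # fallback is an assumption of this sketch
    keep = np.isin(train_y, cands)
    x1 = sense(spectrum[None, :], params)[0]
    return one_nn(x1, sense(train_spectra[keep], params), train_y[keep])


def select_bands(spectra, params):
    """Toy sensor: 'params' simply picks two raw bands."""
    return spectra[:, list(params)]


# Invented data: two classes that overlap in bands (0, 1) but separate in (2, 3).
train_spectra = np.array([
    [0.00, 0.00, 0, 0], [0.10, 0.00, 0, 0], [0.00, 0.10, 0, 0], [0.10, 0.10, 0, 0],
    [0.05, 0.05, 10, 10], [0.15, 0.05, 10, 10], [0.05, 0.15, 10, 10], [0.15, 0.15, 10, 10],
])
train_y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
tuned = {(0, 1): (2, 3)}
label = classify_with_feedback(select_bands, (0, 1), tuned, train_spectra, train_y,
                               np.array([0.07, 0.07, 10.0, 10.0]))
```

In the toy data the factory bands cannot separate the two classes, so the first stage returns both as candidates and the retuned bands settle the decision.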

The results of this experiment indicate that this optimization, which includes
feedback from the exploitation subsystem, can yield significant performance
improvement. The initial classifier places the true class of the test
observation into the top three classes 94.15% of the time. This places a lower
bound on the achievable error of the overall system at $L_{lb} = 0.0585$. Using
a nearest neighbor classifier on these features produces an error of
$L_{nn} = 0.1844$. (If instead of optimizing the parameters for the top-3
classifier we optimize for the nearest neighbor classifier we obtain an error of
$L_{optnn} = 0.165$.) Our two-stage classifier, which adjusts the sensor based
on a preliminary classification as suggested by the “feedback loop” in Figure 1,
has an error of $L_{isp} = 0.101$. Thus this experiment demonstrates a
significant improvement due to altering sensor parameters based on
classification-specific feedback. Notice that we are simulating the effect of
the Gaussian filter feature extraction; if implemented in a sensor system, we
would expect the classification performance to be even better due to
integration gains inherent in observing the spectral features directly.
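
For orientation, the reported error rates relate as follows. The lower bound holds because an observation whose true class is missing from the first-stage candidate list cannot be recovered in the second stage. The relative reductions are computed here from the figures above and are not stated in the original.

```latex
L_{lb} = 1 - 0.9415 = 0.0585, \qquad
L_{lb} \le L_{isp} = 0.101 < L_{optnn} = 0.165 < L_{nn} = 0.1844,
\\[4pt]
\frac{L_{nn} - L_{isp}}{L_{nn}} = \frac{0.0834}{0.1844} \approx 0.45, \qquad
\frac{L_{optnn} - L_{isp}}{L_{optnn}} = \frac{0.064}{0.165} \approx 0.39 .
```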








Figure 3. Illustration of the hyperspectral experiment. First, the sensor collects 
the default bands (1) and a classifier determines the top three classes most likely 
to contain the true class (2). This determines the new bands to sense (3), which 
is fed back to the sensor (4). The sensor collects the appropriate bands, which 
are passed to the ultimate classifier (5). 

5. Pedagogical Example: Multivariate Time Series 

As a pedagogical example of ISP, consider a case in which each observation
consists of a multivariate time series (this sort of data is rather common).
For each entity under investigation, the sensor is capable of observing any of
$d > 1$ time series (“bands”) on a time interval $[0, T]$ at a maximum
resolution $r_{\max}$; that is, at equally-spaced times $t_1 = T/r_{\max}$,
$t_2 = 2T/r_{\max}$, \ldots, $t_{r_{\max}} = r_{\max} T/r_{\max} = T$. However,
sensor and/or channel constraints dictate a maximum throughput for each
observation of $\tau < d \cdot r_{\max}$. This is a reasonable simplified model
of constraints which might be imposed on a real system by limitations of sensor
power, available communications bandwidth, computational power, etc.

We want to perform feature selection based on exploitation-level
considerations, but the exploitation subsystem cannot have access to all
potential features simultaneously. We assume that the sensor/processor
subsystem is capable of adapting to subsample each band at a band-specific
resolution $r_b \le r_{\max}$ (with $b \in \{1, \ldots, d\}$); that is, at
equally-spaced times $t_1 = T/r_b$, $t_2 = 2T/r_b$, \ldots, $t_{r_b} = T$.
(The direct subsampling considered here is done without any filtering of the
continuous time input, and may introduce aliasing; we shall see that ISP
improvement is nonetheless possible.)
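
A minimal sketch of this direct subsampling, assuming the band was first recorded at the maximum resolution and that the coarse resolution divides the maximum one (so the coarse grid is a subset of the fine grid). The function names are hypothetical.

```python
def sample_times(T, r):
    """Equally spaced sampling times t_k = k * T / r for k = 1, ..., r."""
    return [k * T / r for k in range(1, r + 1)]


def subsample(series, r_b):
    """Keep r_b equally spaced samples of a series recorded at r_max = len(series).

    No anti-alias filtering is applied, matching the direct subsampling
    described in the text. Requires r_b to divide r_max (sketch assumption).
    """
    r_max = len(series)
    if r_b == 0:
        return []  # band switched off entirely
    if r_max % r_b != 0:
        raise ValueError("this sketch requires r_b to divide r_max")
    step = r_max // r_b
    # The k-th coarse time k*T/r_b equals the (k*step)-th fine time.
    return series[step - 1::step]


fine = list(range(1, 101))    # stand-in samples at t_1, ..., t_100
coarse = subsample(fine, 50)  # keeps the samples at t_2, t_4, ..., t_100
```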

Given a training sample $\mathcal{D}_n$ of entities with known class labels
(class-conditional training sample sizes $n_j$ for $j \in \{1, \ldots, J\}$
with $\sum_{j=1}^{J} n_j = n$), the goal is to optimize, based on
classification performance, over the collection of band-specific resolutions.
That is, we seek

\[ r^* := \arg\min_{r \in \mathcal{R}_\tau} L_r(g \mid \mathcal{D}_n) \]

where $L_r(g \mid \mathcal{D}_n)$ denotes the probability of misclassification
for classifier $g$ trained on training sample $\mathcal{D}_n$ which has been
subsampled in accordance with resolutions $r$ and, for $c > 0$,

\[ \mathcal{R}_c := \Bigl\{ r \in [0, r_{\max}]^d : \sum_{b=1}^{d} r_b \le c \Bigr\}. \]

Thus $\mathcal{R}_\tau$ is the collection of band-specific resolutions
satisfying the throughput constraint $\tau$.

However, since the exploitation subsystem never sees all the dimensions
simultaneously, this optimization must be performed iteratively. That is, we
begin with an initial sensor setting (say uniform allocation of resolution,
$r^1 = [\tau/d, \ldots, \tau/d]'$) and obtain some measure of which bands are
useful for the classification task at hand. This information is provided to the
sensor/processor subsystem, and the resolution is increased for the more useful
bands and decreased for the less useful bands. (We operate here under the
guiding principle that higher resolution for bands with discriminatory
information is likely to yield an improvement in classification performance.
For this version of ISP to work, as opposed to yielding random search, some
such guiding principle must be present to allow the sensor/processor subsystem
to choose which measurements to make based on feedback from the exploitation
subsystem.)

Let $L^1 := L_{r^1}(g \mid \mathcal{D}_n)$ represent the misclassification
performance using features at the initial choice of resolutions, $r^1$. The
(penalized) feature selection in the first iteration,

\[ r^{1*} := \arg\min_{r} \; L_r(g \mid \mathcal{D}_n) + \lambda \sum_{b=1}^{d} r_b \]

yields performance $L^{1*} := L_{r^{1*}}(g \mid \mathcal{D}_n)$. We expect, if
$d$ is large and the number of bands with significant discriminatory
information is small, that $L^{1*} < L^1$. This expected improvement is due to
the fact that this feature selection represents dimensionality reduction and,
in high dimensions with finite training data, dimensionality reduction done
properly can yield superior performance due to the curse of dimensionality.
(Recall the Jain-Trunk example.)

A simpler version of this feature selection is to perform a band-by-band
analysis to determine which bands are useful and which bands are to be
discarded. This can be accomplished by considering the special unpenalized
“all or nothing” choice of bands:

\[ r^{1*} := \arg\min_{r \in \mathcal{R}'_\tau} L_r(g \mid \mathcal{D}_n) \]

with

\[ \mathcal{R}'_\tau := \bigl\{ r = [r_1, \ldots, r_d]' \in \{0, \tau/d\}^d \bigr\}. \]

At this stage, those bands $b$ for which $r_b^{1*} = 0$ are to be discarded,
with the newly-available channel capacity to be evenly allocated among those
bands which have been deemed useful. Thus $r^2 = [r_1^2, \ldots, r_d^2]'$ where

\[ r_b^2 = I\{r_b^{1*} > 0\} \cdot \tau \Big/ \sum_{\beta=1}^{d} I\{r_\beta^{1*} > 0\}. \]

Finally, we define $L^2 := L_{r^2}(g \mid \mathcal{D}_n)$. If our guiding
principle holds (in this case, that higher resolution will increase the
discriminatory information in the useful bands), then we expect that
$L^2 < L^{1*}$.
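
The reallocation step can be sketched as below. The cap at the maximum resolution and the fallback to a uniform allocation when no band is selected are assumptions added for this sketch; the formula in the text shows neither. Function names are hypothetical.

```python
def in_feasible_set(r, r_max, c):
    """Membership in R_c: every r_b lies in [0, r_max] and the r_b sum to at most c."""
    return all(0 <= rb <= r_max for rb in r) and sum(r) <= c


def reallocate(r1_star, tau, r_max):
    """All-or-nothing feedback: r2_b = I{r1*_b > 0} * tau / (number of useful bands)."""
    useful = [rb > 0 for rb in r1_star]
    n_useful = sum(useful)
    if n_useful == 0:
        # No band selected: fall back to the uniform initial allocation
        # (an assumption made for this sketch).
        return [min(tau / len(r1_star), r_max)] * len(r1_star)
    share = min(tau / n_useful, r_max)  # cap at r_max (sketch assumption)
    return [share if u else 0 for u in useful]


# Two-band setting: tau = 100, r_max = 100, only the second band kept.
r2 = reallocate([0, 50], tau=100, r_max=100)
```

Here the discarded band frees its 50 samples, and the surviving band is raised from 50 to the full 100.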

Of course, the probability of misclassification is not generally available for
use in our optimization objective. Using the available training data
$\mathcal{D}_n$ we can, for any given $r$, obtain an estimate
$\hat{L}_r(g \mid \mathcal{D}_n)$ of the probability of misclassification.
Thus we can, in principle, seek

\[ \hat{r}^* := \arg\min_{r \in \mathcal{R}_\tau} \hat{L}_r(g \mid \mathcal{D}_n). \]

Alternatively, some appropriate surrogate may be employed. For instance, a
simple classifier $g$, one for which $\hat{L}_r(g \mid \mathcal{D}_n)$ is
readily available, can be used in the optimization. Then a more elaborate
classifier $g'$ can be used for the ultimate exploitation. This surrogate
approach will be considered in the sequel. Note, however, that when
exploitation means classification, as it does herein, appropriate surrogates
will likely still require class label information and may need to reside at the
exploitation subsystem, on the opposite side of the channel throughput
constraint from the sensor/processor subsystem.

We consider for illustration the case in which each class $j$, band $b$ process
is autoregressive. That is, the $i$-th observation $X_{j,b,i}$,
$i = 1, \ldots, n_j$, is given by an (independent) autoregressive
$AR_{j,b}(p)$ process of order $p \ge 1$:

\[ X_{j,b,i}(t_k) = \sum_{l=1}^{p} \alpha_{j,b,l} \, X_{j,b,i}(t_{k-l}) + \varepsilon_{j,b,i,k} \]

for $t_k \in \{\ldots, -2T/r_{\max}, -T/r_{\max}, 0, T/r_{\max}, 2T/r_{\max}, \ldots\}$,
where the $\varepsilon_{j,b,i,k}$ are iid normal$(0, \sigma^2)$. We write
$\alpha_{j,b} = [\alpha_{j,b,1}, \ldots, \alpha_{j,b,p}]'$ to denote the
class-specific, band-specific time series parameter vector. (Recall that a
requirement for stationarity yields a constraint on $\alpha_{j,b}$.)
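
A short simulation sketch of such a process, using only the standard library. The burn-in length, the zero start-up values, and the function name are choices made for this illustration rather than details from the text.

```python
import random


def simulate_ar(alpha, n, sigma=1.0, burn_in=200, seed=None):
    """Simulate n samples of X_k = sum_l alpha[l-1] * X_{k-l} + eps_k.

    eps_k are iid normal(0, sigma^2). The first burn_in samples are discarded
    so the output is close to the stationary regime, which only makes sense
    when alpha satisfies the stationarity constraint.
    """
    rng = random.Random(seed)
    p = len(alpha)
    x = [0.0] * p  # arbitrary start-up values
    for _ in range(burn_in + n):
        past = x[-p:][::-1] if p else []  # X_{k-1}, ..., X_{k-p}
        mean = sum(a * xp for a, xp in zip(alpha, past))
        x.append(mean + rng.gauss(0.0, sigma))
    return x[-n:]


# An AR(1) band with coefficient 0.1 observed at 100 time points.
series = simulate_ar([0.1], n=100, seed=1)
```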

In this case, no purely signal processing considerations will allow for the de- 
termination of which bands/resolutions are to be preferred. This determination 
must be made based on feedback from the exploitation module which is in turn 
based on an analysis necessarily taking into account the class labels — classifica- 
tion performance analysis or some appropriate surrogate. 

Maximum likelihood estimates of the parameters $\alpha_{j,b}$ can be obtained
based on observations of the training entities. These estimates are consistent
and asymptotically normal (Anderson (1971)). Thus the training sample provides
for an asymptotically Bayes optimal classifier.

Furthermore, this provides for a reasonable surrogate. For each band $b$ a
hypothesis test of $H_0 : \alpha_{1,b} = \alpha_{2,b}$ against the general
alternative can be performed using Hotelling’s $T^2$ test statistic (Muirhead
(1982)), for instance. Those bands for which the null hypothesis is rejected at
some specified significance level are considered to be “useful” for
discrimination. The consistency of the hypothesis test employed implies that,
in the limit, good bands will not be discarded while most bands with no
discriminatory information will be discarded. For instance, for $d = 25$ with
exactly five of the bands useful for discrimination, testing at the 0.05 level
of significance will be expected to fail to reject for 19 of the 20 useless
bands (which are therefore discarded) while rejecting for all five of the
useful bands (as the estimates $\hat{\alpha}_{j,b}$ approach their asymptotic
distributions). It follows that $L^{1*} < L^1$ for large $T$.
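
For the scalar AR(1) case the band screening reduces to a two-sample test on per-entity coefficient estimates. The sketch below swaps in a pooled-variance t statistic with a caller-supplied critical value; the least-squares estimator, the function names, and the toy numbers are assumptions of this illustration, not the authors' procedure. With 20 uninformative bands tested at the 0.05 level, about 20 × 0.05 = 1 false rejection is expected.

```python
import math
import statistics


def ar1_estimate(x):
    """Least-squares estimate of alpha in X_k = alpha * X_{k-1} + eps_k."""
    num = sum(a * b for a, b in zip(x[1:], x[:-1]))
    den = sum(a * a for a in x[:-1])
    return num / den


def pooled_t(a, b):
    """Two-sample pooled-variance t statistic for equality of means."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * statistics.variance(a)
           + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))


def useful_bands(est_class1, est_class2, crit):
    """est_classJ[b] holds per-entity coefficient estimates for band b.

    A band is kept when |t| exceeds the two-sided critical value crit.
    """
    return [abs(pooled_t(a, b)) > crit for a, b in zip(est_class1, est_class2)]


# Invented estimates for two bands and three entities per class.
est1 = [[0.0, 0.1, -0.1], [0.0, 0.1, -0.1]]
est2 = [[0.05, 0.0, -0.05], [0.6, 0.7, 0.5]]
# 2.776 is the two-sided 5% critical value of Student's t with 4 degrees of freedom.
keep = useful_bands(est1, est2, crit=2.776)
```

In the toy data the first band shows no difference between classes and is dropped, while the second differs strongly and is kept.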

More specifically, for the two class, two band AR(1) case ($p = 1$, $J = 2$,
and $d = 2$), consider $T = 1$, $r_{\max} = 100$, and initial sensor settings
of $r_b = 50$ for $b = 1, 2$ ($r^1 = [50, 50]'$). Let the class $j = 1$ model
be specified by $\alpha_{1,1} = \alpha_{1,2} = 0$; similarly, let the class
$j = 2$ model be specified by $\alpha_{2,1} = 0$ and $\alpha_{2,2} = 0.1$.
(For $p = 1$ we drop the superfluous lag subscript $l$ from the parameters
$\alpha_{j,b,l}$.) Thus there is no discriminatory information in band
$b = 1$, while band $b = 2$ at the highest resolution will allow for optimal
discrimination. For these AR(1) processes, a $t$-test of
$H_0 : \alpha_{1,b} = \alpha_{2,b}$ is an appropriate surrogate, and is here
employed. To obtain $r^{1*}$ we optimize over $\mathcal{R}'_{100}$ via these
$t$-tests, meaning that if exactly one band rejects the null hypothesis we
completely eliminate the band which fails to reject and up-sample, to full
resolution $r_{\max} = 100$, the band which does reject the null hypothesis.
Using class-conditional training sample sizes $n_j = 10$, classification
performance based on these observations, as measured by a Monte Carlo estimate
$\hat{L}$ based on 50 Monte Carlo replicates of 100 test samples per class per
replicate, is

\[ L^1 = 0.2184, \qquad L^{1*} = 0.2156, \qquad L^2 = 0.0426. \]

Thus, as designed, the exploitation-based feedback and sensor adaptation yield
$L^2 \ll L^1$. As noted above, the consistency of the hypothesis test employed
in this example implies that, for large enough class-conditional sample sizes,
this empirically observed result can be proved; that is, $L^2 \ll L^1$. (Note
that, since $d = 2$ for this case, $L^1 \approx L^{1*}$ is not surprising.)

Regarding the first feature selection, 43 times out of 50 Monte Carlo
replicates this selection correctly chose band $b = 2$ ($r^{1*} = [0, 50]'$).
In five cases both bands yielded rejection in the hypothesis test, in which
cases $L^2 = L^{1*} = L^1$. In one case neither band yielded rejection; again
$L^2 = L^{1*} = L^1$. In one case
band $b = 1$ only (the wrong selection!) yielded rejection; for this one
replicate $\hat{L}^2_{\mathrm{repl}} \approx \hat{L}^{1*}_{\mathrm{repl}} > \hat{L}^1_{\mathrm{repl}}$.

6. Experiment: “Artificial Nose” Chemical Sensor 

We consider data taken from a novel chemical sensor/optical read-out system
designed and constructed at Tufts University. The fundamental component of
this sensor is a solvatochromic dye embedded in a polymer matrix (White et al.
(1996)) which responds to the introduction of a chemical analyte to its
environment with a change in its fluorescence intensity. These basic devices
can be fabricated in a number of well characterized variants, each responding
in some way to particular chemical analytes (Dickinson et al. (1996)). In
general, the devices are cross reactive rather than specific; that is, each
will respond significantly
to a variety of analytes, although fortunately with differences in the details of 
the response signature from one analyte to another. By analyzing the responses 
of several of these devices one may obtain a specific identification in many cases 
of interest. 

For application of these devices in a sensor system, the fluorescence signature 
must be stimulated and read-out during the exposure of a device to an analyte. 
For example, a device can be attached to an optical fiber through which laser 
illumination is provided in order to stimulate the signature fluorescence of that 
device. The resulting light signal is conducted back through the same fiber for 
read-out. Typically, an array of devices with their optical fiber readouts will be 
bundled together to make a sensor. See Priebe (2001) for a discussion of pattern 
recognition for this kind of sensor. 

The Tufts data we study in this section was obtained from a bundle of 19 
varying sensors attached to fibers. An observation is obtained by passing an 
airborne analyte (a single chemical compound or a mixture) over the fiber bundle 
in a four second pulse, or “sniff.” The information of interest is the change over 
time in emission fluorescence intensity of the dye molecules for each of the 19 
fiber-optic sensors (see Figure 4). 

Data collection consists of recording sensor responses to various analytes at
various concentrations. Each observation is a measurement of the time varying
fluorescence intensity at each of two wavelengths (620 nm and 680 nm), within
each sensor of the 19-fiber bundle. The sensor produces observations
$X_{j,b,i}(t_k)$ where $b = 1, \ldots, d = 38$ represents the fiber-bandwidth
pair $\phi \cdot \lambda$ for fibers $\phi \in \{1, \ldots, 19\}$ and
wavelengths $\lambda \in \{1, 2\}$. The index $i = 1, \ldots, n$ represents
the observation number. The class label $j$ flags the presence or absence of a
chemical of interest, described in more detail below. While the process is
naturally described as functional with $t$ ranging over a 20 second interval
$[0, T = 20]$, the data as collected are discrete with the 20 seconds recorded
at $r_{\max} = 60$ equally spaced time steps
$t_k = \frac{20}{60}, \frac{40}{60}, \ldots, \frac{1200}{60}$, for each
response. Construction of the database involves taking replicate observations
for the various mixtures of chemical analytes.

Figure 4. The Tufts artificial nose consists of optical fibers doped with a
solvatochromic dye. Reaction of the polymer matrix with an analyte produces
photons which are sampled at two wavelengths to produce a response for each
fiber. These photons are captured by a CCD device, resulting in a time series of
light intensity above (or below) the background intensity. The figure illustrates
the response of two fibers sampled at a single wavelength.

The sensor responses are inherently aligned due to the “sniff” signifying the 
beginning of each observation. The response for each sensor for each observation 
is normalized by manipulating the individual sensor baselines. This preprocess- 
ing consists of subtracting the background sensor fluorescence (the intensity prior 
to exposure to the analyte) from each response to obtain the desired observation: 
the change in fluorescence intensity for each fiber at each wavelength.
Functional data analysis smoothing techniques are utilized to smooth each
sensor response (Ramsay and Silverman (1997)).
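
The baseline subtraction can be sketched as follows. How the background level was actually measured is not stated, so the `n_pre` parameter (number of samples recorded before the analyte arrives) and the intensity values are hypothetical; the smoothing step is not shown.

```python
def subtract_baseline(response, n_pre):
    """Change in intensity relative to the mean pre-exposure level.

    response: raw intensities for one fiber/wavelength channel
    n_pre:    number of leading samples taken before exposure (hypothetical)
    """
    baseline = sum(response[:n_pre]) / n_pre
    return [v - baseline for v in response]


raw = [100.0, 100.0, 100.0, 104.0, 109.0, 103.0]  # made-up intensities
delta = subtract_baseline(raw, n_pre=3)
```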

The task at hand is the identification of an unlabelled odorant observation
$X$. Specifically, we consider the detection of trichloroethylene (TCE) in complex
backgrounds. (TCE, a carcinogenic industrial solvent, is of interest as the target 
due to its environmental importance as a groundwater contaminant.) 

In addition to TCE in air, eight diluting odorants are considered: BTEX (a 
mixture of benzene, toluene, ethylbenzene, and xylene), benzene, carbon tetra- 
chloride, chlorobenzene, chloroform, kerosene, octane, and Coleman fuel. Dilu- 
tion concentrations of 1:10, 1:7, 1:2, 1:1, and saturated vapor are considered. 

We consider the training database
$\mathcal{D}_n = [(X_1, Y_1), \ldots, (X_n, Y_n)]'$ to consist of
38-dimensional time series (representing odorant observations) and their
associated class labels $Y_i \in \{1, 2\}$ (TCE absent and present,
respectively). The database $\mathcal{D}_n$ consists of $n_1$ observations
from class 1 and $n_2$ observations from class 2. Class 1, the TCE-absent
class, consists of $n_1 = 352$ observations; the database $\mathcal{D}_n$
contains 32 observations of pure air and 40 observations of each of the eight
diluting odorants at various concentrations in air. There are likewise
$n_2 = 760$ class 2 (TCE-present) observations; 40 observations of pure TCE,
80 observations of TCE diluted to various concentrations in air, and 80
observations of TCE diluted to various concentrations in each of the eight
diluting odorants in air are available. Thus there are
$n = n_1 + n_2 = 1112$ observations in the training database $\mathcal{D}_n$.
This database is well designed to allow for investigation of the ability of the
sensor array to identify the presence of one target analyte (TCE) when its
presence is obscured by a complex background; this is referred to as the
“needle in the haystack” problem. This is the database considered in
Priebe (2001).
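
As a check, the class sizes follow directly from the counts just listed:

```latex
n_1 = 32 + 8 \cdot 40 = 352, \qquad
n_2 = 40 + 80 + 8 \cdot 80 = 760, \qquad
n = n_1 + n_2 = 1112 .
```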

As in our pedagogical autoregressive process example, we consider a throughput
constraint. In this case, with $d = 38$ and $r_{\max} = 60$, consider a
throughput constraint of $\tau = 1140 < d \cdot r_{\max} = 2280$. Then
$\tau/d = 30$. Let
$r^1 = [\tau/d, \ldots, \tau/d]' = [r_{\max}/2, \ldots, r_{\max}/2]'$. With
this initial setup we obtain $L^1 = 0.237$. (Probability of misclassification
error rates here are obtained via 10-fold cross-validation using the
one-nearest neighbor classifier.)

We obtain $r^{1*}$ by optimizing over $\mathcal{R}'_\tau$. Actually, this
still leaves $2^{38}$ candidate dimensionality reductions to consider, and so
we “sub-optimize”: we calculate $\hat{L}_b(g \mid \mathcal{D}_n)$ for each
individual band $b = 1, \ldots, d$ and select the “best few”. A subset of 12
of the 38 bands is selected based on this criterion, and after this
optimization we obtain $L^{1*} = 0.121$.

The best 12 individual bands selected for $r^{1*}$ are then upsampled, while
the remaining 26 are downsampled. The components of $r^2$ are given by

\[ r_b^2 = I\{r_b^{1*} > 0\} \cdot r_{\max} + I\{r_b^{1*} = 0\} \cdot r_{\max}/4. \]

After optimization and feedback adjustment we obtain $L^2 = 0.102$.
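
A quick check that this second-stage allocation respects the throughput budget. The text does not list which 12 bands were selected, so the selection mask below is a placeholder; only the counts matter for the total.

```python
def second_stage_resolutions(selected, r_max):
    """r2_b = r_max for selected bands and r_max / 4 for the rest."""
    return [r_max if s else r_max / 4 for s in selected]


d, r_max, tau = 38, 60, 1140
# Placeholder mask: the first 12 bands stand in for the 12 selected bands.
selected = [b < 12 for b in range(d)]
r2 = second_stage_resolutions(selected, r_max)
total = sum(r2)  # 12 * 60 + 26 * 15 = 1110
within_budget = total <= tau
```

The total of 1110 samples per observation stays below the budget of 1140.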

We have, as desired, $L^2 < L^{1*} < L^1$. The improvement from $r^1$ to
$r^{1*}$ is dramatic, indicating that the dimensionality reduction employed,
although simplistic, was successful. Using $r^2$ as opposed to $r^{1*}$ yields
a further improvement of 1.9 percentage points. The number of misclassified
observations drops from 134 to 113, a reduction of 21 observations, or 15.7% of
the previously misclassified observations. This improvement obtained by using
$r^2$ as opposed to $r^{1*}$ is statistically significant (McNemar’s test).
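
McNemar's test compares two classifiers on the same observations using only the discordant cases. The sketch below implements the exact (binomial) form. The text reports only the error totals, not how the errors overlap, so the discordant counts in the example are invented for illustration and the resulting p-value is not a result from the paper.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: observations misclassified under the first setting only
    c: observations misclassified under the second setting only
    Under the null hypothesis the b + c discordant observations split as
    Binomial(b + c, 1/2).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts chosen so that b - c = 21 (the net change).
p_value = mcnemar_exact(b=40, c=19)
```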



7. Discussion 

We have presented examples illustrating “Integrated Sensing and Processing”
(ISP) as a path towards end-to-end optimization of a
sensor/processor/exploitation system with respect to its performance in
supervised statistical pattern recognition (classification) tasks. The approach
we have studied in this paper takes the form of dimensionality reduction in
sensor feature space coupled with adaptation of sensor features. These
techniques are aimed explicitly at improving an exploitation objective, the
probability of misclassification, and are necessarily implemented iteratively
due to throughput constraints.

We note that the results presented are quite preliminary and only begin explo- 
ration of the ISP concept. For instance, classifier adaptation and optimization 
is certainly an aim in ISP, although we have not pursued this direction in the 
present paper. Ultimately, ISP seeks to jointly optimize sensor function, digital 
preprocessing, and exploitation systems, including classifier design; however, it 
is our belief that this issue is secondary to that of dimensionality reduction for 
many high-dimensional classification applications. 

Dimensionality reduction is fundamentally important for many disparate ap- 
plications in pattern recognition as well as in other fields including control, mod- 
eling and simulation, operations research, and visualization. The topic is the 
subject of intense research in these various communities, and now becomes a 
fundamental enabling technology for the new discipline of ISP. In this paper we 
have considered only very simple dimensionality reduction methodologies, which 
just begin to indicate the possibilities and implications for integrating sensing 
and processing. Nevertheless, we feel that the results of these first experiments 
indicate significant promise for this line of inquiry. 

A critically important aspect of the dimensionality reduction strategies
considered in this paper is the identification of some principle or heuristic
for guiding the sensor/processor subsystem in its choices of which measurements
to make based on dimensionality-reduction feedback from the exploitation
subsystem. The choice of such a principle is a sensor- and application-specific
task. For many multivariate time series scenarios the “higher resolution in
useful bands” approach taken in this paper seems to be a reasonable principle.
This might be extended to include variable resolution in quantization, or in
spatial sampling in other sensors. Finding appropriate guiding principle(s) for
various important cases of practical interest may perhaps represent the single
most important aspect of developing a workable ISP methodology.

Abstract. In this paper we investigate quadrature rules for functions on
compact Lie groups and sections of homogeneous vector bundles associated 
with these groups. First a general notion of band-limitedness is introduced 
which generalizes the usual notion on the torus or translation groups. We 
develop a sampling theorem that allows exact computation of the Fourier 
expansion of a band-limited function or section from sample values and 
quantifies the error in the expansion when the function or section is not 
band-limited. We then construct specific finitely supported distributions on 
the classical groups which have nice error properties and can also be used 
to develop efficient algorithms for the computation of Fourier transforms 
on these groups. 

1. Introduction

The Fourier transform of a function on a compact Lie group computes the 
coefficients (Fourier coefficients) that enable its expression as a linear combina- 
tion of the matrix elements from a complete set of irreducible representations 
of the group. In the case of abelian groups, especially the circle and its lower 
dimensional products (tori) this is precisely the expansion of a function on these 
domains in terms of complex exponentials. This representation is at the heart 
of classical signal and image processing (see [25; 26], for example). 

The successes of abelian Fourier analysis are many, ranging from national 
defense to personal entertainment, from medicine to finance. The record of 
achievements is so impressive that it has perhaps sometimes led scientists astray, 
seducing them to look for ways to use these tools in situations where they are 
less than appropriate: for example, pretending that a sphere is a torus so as 
to avoid the use of spherical harmonics in favor of Fourier series — a favored 
mathematical hammer casting the multitudinous problems of science as a box of 
nails. 

There is now, however, in the applied and engineering communities a growing
awareness, appreciation, and acceptance of the use of the techniques of
non-abelian Fourier analysis. A favorite example is the use of spherical
harmonics for problems with spherical symmetry. While this is of course
classical mathematical technology (see [2; 23], for example), it is only fairly
recently that serious attention has been paid to the algorithmic and
computational questions that arise in looking for efficient and effective means
for their computation [4; 8; 22]. Recent applications include the new analysis
of the cosmic microwave background (CMB) data: in this setting, the highest
order Fourier coefficients of the function that measures the CMB in all
directions from a central point are expected to reveal clues to understanding
events in the first moments following the Big Bang [24; 32]. Other examples
include the use of spherical harmonic transforms in estimation and control
problems on group manifolds [18; 19], and for the solution of nonlinear partial
differential equations on the sphere, such as the PDEs of climate modeling [1].
The closely related problem of computing Fourier transforms on the Lie group
SO(3) is receiving increased attention for its applicability in volumetric
shape matching [13; 14; 17].

In order to bring these new transforms to bear on applications, we must
bring the well-studied analytic theory of the representations of compact groups
(see [33], for instance) into the realm of the computer. Generally speaking,
implementation requires that two problems be addressed. On the one hand we need
to find a reduction of the a priori continuous data to a finite set of samples
of the function, and possibly of its derivatives as well, and we must solve the
concomitant problem of function reconstruction, which may only be approximate,
from this finite set of samples. This is the sampling problem. On the other
hand, efficient and reliable algorithms are required in order to turn the
discrete data into the Fourier coefficients. These sorts of algorithms go by
the name of Fast Fourier Transforms or FFTs.

In the abelian case the theory and practice are by now well-known. Shannon
sampling is the terminology often used to encompass the solution of the
sampling problem for functions on the line, or (more relevant to this paper)
the problem of sampling for a function on the circle, while the associated FFT
provides tremendous efficiencies in computation.

In this paper we focus on the sampling problem for compact Lie groups, 
through an investigation of quadrature rules on these groups. Following the 
well-known abelian case we distinguish between two situations: the band-limited 
case in which the function in question is known to have only a finite number 
of nonzero Fourier coefficients, and the non-band-limited case. In the former 
situation it is possible to exactly reconstruct the function from a finite collection 
of samples, while in the latter, the best we can hope for is an approximation to the 
Fourier expansion, as well as some measure of how close this approximation is.
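The two situations can be illustrated on the circle, the abelian model case. The following is a minimal sketch, not taken from the paper: the function names, the test functions, and the choice of s + t + 1 equispaced sample points are assumptions made for illustration. For a function of band-limit t the coefficients with |k| <= s are recovered exactly; for a function with coefficients r^|k| the error is controlled by the tail of the expansion beyond t.

```python
import cmath
import math

def sampled_coefficients(f, s, t):
    """Approximate the Fourier coefficients c_k, |k| <= s, of f on the circle
    from n = s + t + 1 equispaced samples (uniform weights 1/n)."""
    n = s + t + 1
    xs = [2 * math.pi * j / n for j in range(n)]
    vals = [f(x) for x in xs]
    return {k: sum(v * cmath.exp(-1j * k * x) for v, x in zip(vals, xs)) / n
            for k in range(-s, s + 1)}

s, t = 3, 5

# Band-limited case: band-limit t, so coefficients with |k| <= s come out exactly.
band_limited = lambda x: 2.0 + cmath.exp(3j * x) - 0.5 * cmath.exp(-5j * x)
exact = sampled_coefficients(band_limited, s, t)

# Non-band-limited case (truncated at |k| = 60 for evaluation): c_k = r^|k|.
r = 0.5
geometric = lambda x: sum(r ** abs(k) * cmath.exp(1j * k * x) for k in range(-60, 61))
approx = sampled_coefficients(geometric, s, t)
worst_error = max(abs(approx[k] - r ** abs(k)) for k in approx)

# Tail sum of |c_k| over |k| > t, which bounds the aliasing error.
tail = 2 * r ** (t + 1) / (1 - r)
```

Here `worst_error` stays below `tail` because every aliased frequency k + jn with j nonzero and |k| <= s has absolute value greater than t.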

We first describe a general setting, a filtered algebra, where an extension of 
the classical notion of band-limited, as in [28], makes sense, and adapt it to 
the special case of functions on a compact Lie group, $G$. We define a space of
functions $\mathcal{A}_s$ on $G$, the band-limited functions with band-limit $s$, in such a way
that $\mathcal{A}_s.\mathcal{A}_t$ is contained in $\mathcal{A}_{s+t}$. Then we develop a sampling theorem of the
following form:

Assume $\varphi$ is a distribution on $G$ and $f$ is a continuous function on $G$ that
is sufficiently differentiable for the product $f.\varphi$ to exist. There is a canonical
projection, $P_s$, from the space of distributions onto $\mathcal{A}_s$. We describe norms,
$\|\ \|$, $\|\ \|_*$, $\|\ \|_{**}$ such that

\[ \|P_s(f.(\varphi - \mu))\| \le M(s,t)\, \|\varphi - \mu\|_*\, \|(1 - P_s)f\|_{**}, \]

provided that $P_{s+t}(\varphi - \mu) = 0$, where $\mu$ is Haar measure of unit mass on the
group and $M(s,t)$ is a function which we explicitly bound in the case of the
classical groups.

When $f$ is band-limited this gives a condition on the distribution used to
sample $f$ that allows exact computation of the Fourier transform of $f$ from the
sampled function. When $f$ is not band-limited it quantifies the error introduced
when using the Fourier expansion of $f.\varphi$ to approximate that of $f$. In particular
we show that for sufficiently differentiable functions the projection of the
approximate expansion onto a space of band-limited functions closely approximates the
projection of the original function onto this space without requiring significantly
more sample values than the dimension of the band-limited space. The amount
of oversampling is related to the growth function of the algebra generated by
the matrix coefficients, and hence to its Gel'fand-Kirillov dimension. This is the
content of Section 2.

In Section 3 we extend these results to the expansion of sections of homoge- 
neous vector bundles in terms of basis sections coming from the decomposition of 
the corresponding induced representation, e.g. the expansion of a tensor field on
the sphere in tensor spherical harmonics [16]. Finally in Section 4 we construct
finitely supported distributions on the classical groups which are convolutions
of distributions supported on one parameter subgroups and which have all the
properties required by the sampling theorem, i.e. $P_{s+t}(\varphi - \mu) = 0$ and $\|\varphi - \mu\|_*$
is bounded. These distributions can be used to develop fast algorithms for the
computation of Fourier transforms on these groups. A general algebraic ap- 
proach for such algorithms, which uses efficient algorithms for computing with 
orthogonal polynomial systems [5], is presented in [21]. 

Remark. This paper only considers the compact case, but the non-compact case
is at least as interesting. In this setting G. Chirikjian has pioneered the use of
representation theoretic techniques for a broad range of interesting applications 
including robotics, image processing, and computational chemistry [3]. 



2. Sampling of Functions 

Before going into the general situation it is instructive to consider the familiar
case of functions on the 2-sphere $S^2$, identified with the subalgebra of functions
on the compact Lie group SO(3) that are right-invariant with respect to
translation by SO(2), the subgroup of rotations that leave fixed the North Pole. See
Section 2.2.1 for notation.

Example: The Fourier transform on $S^2$. Let $Y_{lm}$, with $|m| \le l$, denote the
spherical harmonic on $S^2$ of order $l$ and degree $m$ (see [23] for explicit definitions).
Any continuous function, $f$, on $S^2$ has an expansion in spherical harmonics
$\sum_{lm} a_{lm} Y_{lm}$ which converges under suitable conditions on $f$, e.g., when $f$ is $C^2$.
The coefficients $a_{lm}$ are called the Fourier coefficients of the function $f$.

Assume $s$ is a nonnegative integer; then $f$ is said to be band-limited with
band-limit $s$ if all the coefficients $a_{lm}$ in the expansion of $f$ are zero for $l > s$,
i.e. if $f = \sum_{|m| \le l \le s} a_{lm} Y_{lm}$. If we now pick $N = (s+1)^2$ points $x_1, \ldots, x_N$ on
$S^2$ in general position, then the function values of $f$ at these points completely
determine $f$ provided $f$ is band-limited with band-limit $s$, so the linear map
from function values $(f(x_i))_{1 \le i \le N}$ to coefficients $(a_{lm})_{|m| \le l \le s}$ is a vector space
isomorphism. The numbers $a_{lm}$ can be found from the function $f$ using the
formula $a_{lm} = \int_{S^2} f.\overline{Y_{lm}}\, d\mu$, where $\mu$ is the invariant measure on the sphere of
unit mass. We can also find these numbers by inverting the equations
$f(x_i) = \sum_{|m| \le l \le s} a_{lm} Y_{lm}(x_i)$. Another method would be to calculate the integrals using
sums of the form

\[ \sum_{i=1}^{N} f(x_i)\, \overline{Y_{lm}}(x_i)\, w_i \]

where the $w_i$ are numbers, called sample weights, depending only on the points
$x_i$. This is only possible, however, if the $w_i$ and the $x_i$ satisfy

\[ \sum_{i=1}^{N} Y_{lm}(x_i)\, w_i = \delta_{(0,0),(l,m)} \quad \text{for } |m| \le l \le 2s, \]

which is not usually possible for general sets of $N = (s+1)^2$ points, but is
possible for general sets of $N = (2s+1)^2$ points; the condition then determines
the sample weights, $w_i$. This is precisely the condition that we can integrate
exactly any band-limited function of band-limit $2s$ using the points and weights,
and it follows from the fact that the product of two band-limited functions of
band-limit $s$ has band-limit $2s$.
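The last step can be written out as a short worked equation. This is a sketch under one stated assumption: the $Y_{lm}$ are taken to be orthonormal with respect to the unit-mass measure $\mu$, a normalization the text does not spell out. For $f$ of band-limit $s$ and $|m| \le l \le s$, each product $Y_{l'm'}\overline{Y_{lm}}$ has band-limit $2s$, so a rule that integrates band-limit $2s$ exactly reproduces the coefficients:

```latex
\begin{aligned}
\sum_{i=1}^{N} f(x_i)\,\overline{Y_{lm}}(x_i)\,w_i
  &= \sum_{|m'| \le l' \le s} a_{l'm'} \sum_{i=1}^{N} Y_{l'm'}(x_i)\,\overline{Y_{lm}}(x_i)\,w_i \\
  &= \sum_{|m'| \le l' \le s} a_{l'm'} \int_{S^2} Y_{l'm'}\,\overline{Y_{lm}}\,d\mu
   \qquad (\text{exactness on band-limit } 2s) \\
  &= a_{lm}.
\end{aligned}
```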

What about functions that may not be band-limited? To treat this more
general case we first rewrite this discussion. Let $\mathcal{A}_s$ denote the space of
band-limited functions with band-limit $s$, let $\varphi_s = \sum_i w_i \delta_{x_i}$ be a finitely supported
measure on $S^2$, and let $b_{lm} = \int_{S^2} f.\overline{Y_{lm}}\, d\varphi_s$ be the Fourier coefficients of the
finite measure $f.\varphi_s$. If $f$ is in $\mathcal{A}_s$ and $(\varphi_s - \mu, \mathcal{A}_{2s}) = 0$, then $a_{lm} = b_{lm}$ for
$|m| \le l \le s$; to obtain the condition above note that $\mathcal{A}_s^2 = \mathcal{A}_{2s}$. If $f$ is not in
$\mathcal{A}_s$, then we cannot assume that we will have $a_{lm} = b_{lm}$ for $l \le s$, but we can
bound the error. It follows from the example immediately after Theorem 3.7
that, provided $(\varphi_s - \mu, \mathcal{A}_{2s}) = 0$, we have

\[ \sum_{l=0}^{s} (2l+1) \Bigl( \sum_{m=-l}^{l} |b_{lm} - a_{lm}|^2 \Bigr)^{1/2}
   \le 2(s+1)^4 \Bigl( \sum_{i=1}^{N} |w_i| \Bigr) \sum_{l>s} (2l+1) \Bigl( \sum_{m=-l}^{l} |a_{lm}|^2 \Bigr)^{1/2}. \]

Let $P_s$ denote the projection from the space of distributions $C^0(S^2)'$ onto $\mathcal{A}_s$
given by truncation of the expansion in spherical harmonics; then we can rewrite
the above inequality to obtain

\[ \|P_s(f.(\varphi_s - \mu))\|_{C^0} \le \|P_s(f.(\varphi_s - \mu))\|_{A_0}
   \le 2(s+1)^4\, \|\varphi_s\|_{C'}\, \|(1 - P_s)f\|_{A_0}
   \le K\, \|\varphi_s\|_{C'}\, \|(1 - P_s)f\|_{W_6}, \]

where $\|\ \|_{A_0}$ is the norm of absolute summability inherited from that on SO(3),
$\|\ \|_{W_6}$ is the Sobolev norm on $C^6$, and $K$ is a positive constant; the last inequality
follows from an application of Bernstein's theorem on SO(3) (see [6; 27]). Hence,
if $f$ is in $C^6$, and $\varphi_s$ is a sequence of measures on $S^2$ which converges weak-* to
$\mu$ and for which $(\varphi_s - \mu, \mathcal{A}_{2s}) = 0$, then $\|P_s(f.(\varphi_s - \mu))\|_{C^0}$ tends to zero as $s$
tends to infinity.

This approach to the construction of quadrature rules for functions on $S^2$
can be generalized; doing so is the goal of the remainder of this section, which is
divided into two parts. First we generalize the band-limited sampling of the
introduction to filtered algebras and outline an approach for dealing with
functions which are not band-limited. Next we treat the case of continuous functions
on a compact Lie group, $G$. Any such function, $f$, has a Fourier expansion in
terms of the matrix coefficients of irreducible unitary representations of $G$. The
Fourier transform of $f$ is the collection of all coefficients in this expansion, and
may be represented as an element of the space $\prod_\gamma \operatorname{End} V_\gamma$, where $\gamma$ ranges over
the irreducible unitary representations of $G$, and $V_\gamma$ is the space on which this
representation acts. Sampling a $C^m$ function, $f$, corresponds to multiplying it by
a distribution, $\varphi$, of order at most $m$. By putting norms on the space $\prod_\gamma \operatorname{End} V_\gamma$
we can, under suitable assumptions on $\varphi$, bound the difference between a finite
number of the Fourier coefficients of $f$ and $f.\varphi$.

In what follows we assume a familiarity with the basic ideas and tools of the 
representation theory of compact groups. There are many excellent resources for 
this material. Standard texts include [33; 29]. 

2.1. An Abstract Framework. Several of the results of this paper fit into a
simple framework. Assume $\mathcal{A}$ is a complex algebra and $\{\mathcal{A}_s\}$ is a set of subspaces
of $\mathcal{A}$ such that $\mathcal{A}_s.\mathcal{A}_t \subseteq \mathcal{A}_{s+t}$, where $s$ and $t$ range over some semigroup, which
we shall take to be the non-negative integers or reals. Let $\mathcal{A}'$ denote the dual of
$\mathcal{A}$, and define an $\mathcal{A}$-module structure on $\mathcal{A}'$ by

\[ (a.\varphi)(g) = \varphi(g.a) \]

for any $a$, $g$ in $\mathcal{A}$, and $\varphi$ in $\mathcal{A}'$. Let $P_s$ denote the projection from $\mathcal{A}'$ onto $\mathcal{A}'_s$
given by restriction of linear functionals. Then we have the following trivial
result.

Lemma 2.1. Assume $\varphi$, $\mu$ are linear functionals in $\mathcal{A}'$ such that $P_{s+t}(\varphi - \mu) = 0$.
Then

\[ P_s(f.\varphi) = P_s(f.\mu) \]

for any $f$ in $\mathcal{A}_t$.

This lemma simply states that, if the linear functionals, $\varphi$ and $\mu$, agree on the
subspace $\mathcal{A}_{s+t}$, then they also agree on the subspace $\mathcal{A}_s.\mathcal{A}_t$.
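A small instance of this lemma can be checked numerically. The sketch below is an illustration only, with an algebra chosen for convenience rather than taken from the paper: polynomials on [-1, 1] filtered by degree, the normalized integral as one functional, and the two-point Gauss-Legendre rule (exact through degree 3) as the other. All function names are hypothetical.

```python
import math

# Polynomials are coefficient lists [c0, c1, ...]; A_s = polynomials of degree <= s.
def poly_mul(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_eval(a, x):
    return sum(c * x ** k for k, c in enumerate(a))

def mu(a):
    """Normalized integral over [-1, 1]: (1/2) * integral of a(x) dx."""
    return sum(c / (k + 1) for k, c in enumerate(a) if k % 2 == 0)

def phi(a):
    """Two-point Gauss-Legendre rule with unit total mass; agrees with mu on A_3."""
    node = 1 / math.sqrt(3)
    return 0.5 * (poly_eval(a, node) + poly_eval(a, -node))

def module_action(f, functional):
    """Return the functional g -> functional(g * f), i.e. f.functional."""
    return lambda g: functional(poly_mul(g, f))

# s = 1, t = 2: phi and mu agree on A_{s+t} = A_3.
basis_A1 = [[1.0], [0.0, 1.0]]           # 1 and x span A_1
f = [0.5, -2.0, 3.0]                     # degree 2, so f lies in A_2
gaps = [abs(module_action(f, phi)(g) - module_action(f, mu)(g)) for g in basis_A1]

# Outside A_2 the conclusion can fail: take f = x^3 and test against g = x.
f_bad = [0.0, 0.0, 0.0, 1.0]
gap_bad = abs(module_action(f_bad, phi)([0.0, 1.0]) - module_action(f_bad, mu)([0.0, 1.0]))
```

The entries of `gaps` vanish up to rounding, while `gap_bad` equals |1/9 - 1/5|, the quadrature error on x^4.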

Example. Assume $\mathcal{A}$ is a finitely generated $\mathbb{C}$-algebra with identity, and let $S$
be a finite generating set containing the identity. Define $S_0 = \mathbb{C}.1$, and let $S_k$
denote the span of all products of $k$ elements of $S$. Then $S_k.S_l = S_{k+l}$ for any
nonnegative integers $k$ and $l$.

The lemma above does not necessarily hold for elements, $f$, which do not belong
to $\mathcal{A}_t$. To deal with this case, let us introduce norms on the algebra, $\mathcal{A}$. Assume
that $\|\ \|_{A'_s}$ is a norm on $\mathcal{A}'_s$ and that $\|\ \|_A$, $\|\ \|_B$ are norms on $\mathcal{A}$. Let $\mathcal{A}^{A'}$ be
the continuous dual of $\mathcal{A}$ with respect to $\|\ \|_A$, let $\|\ \|_{A'}$ denote the dual norm,
and let $\mathcal{A}^B$ be the completion of $\mathcal{A}$ with respect to $\|\ \|_B$. Now define

\[ M(s,t) = \sup\{ \|P_s(h.\varphi)\|_{A'_s} : \|h\|_B = 1,\ \|\varphi\|_{A'} = 1,\ h \in \mathcal{A},\ \varphi \in \mathcal{A}',\ P_{s+t}\varphi = 0 \}. \]

When there is a possibility of confusion, we shall denote this $M^{A'_s, A}_{B}(s,t)$. If
$M(s,t) < \infty$ then $P_s(h.\varphi)$ is well defined whenever $\varphi$ is in the $A$-continuous
dual of $\mathcal{A}$, $P_{s+t}\varphi = 0$, and $h$ is in the $B$-completion of $\mathcal{A}$. In addition, it only
depends on the coset of $h$ modulo $\mathcal{A}_t$.

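Lemma 2.2. Assume $\varphi$, $\mu$ are linear functionals in $\mathcal{A}^{A'}$ such that
$P_{s+t}(\varphi - \mu) = 0$, and let $f \in \mathcal{A}^B$. Then

\[ \|P_s(f.\varphi) - P_s(f.\mu)\|_{A'_s} \le M(s,t)\, \|\varphi - \mu\|_{A'}\, \|f\|_{B/\mathcal{A}_t}, \]

where $\|\ \|_{B/\mathcal{A}_t}$ denotes the quotient seminorm on $\mathcal{A}^B/\mathcal{A}_t$.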

The next section of this paper is concerned with bounding $M(s,t)$ in the case
where $\mathcal{A}$ is the algebra spanned by the matrix coefficients of finite dimensional
representations of a compact Lie group. We shall also bound the quantity

\[ \widetilde{M}(s,t) = \sup\{ \|e.h\|_{A/\mathcal{A}_{s+t}} : \|e\|_{A_s} = 1,\ \|h\|_B = 1,\ e \in \mathcal{A}_s,\ h \in \mathcal{A} \} \]

for some particular choices of norms $\|\ \|_{A_s}$ on $\mathcal{A}_s$. If $\mathcal{A}_s$ is finite dimensional
and $\|\ \|_{A'_s}$ is dual to $\|\ \|_{A_s}$, then we have $M(s,t) \le \widetilde{M}(s,t)$. Weakening $\|\ \|_A$ or
$\|\ \|_{A'_s}$, or strengthening $\|\ \|_B$ or $\|\ \|_{A_s}$, will decrease $M(s,t)$ and $\widetilde{M}(s,t)$.

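When the algebra $\mathcal{A}$ has a symmetric bilinear form $(\ ,\ )$ such that
$(a_1, a_2.a_3) = (a_1.a_2, a_3)$, then we have an $\mathcal{A}$-module morphism from $\mathcal{A}$ into $\mathcal{A}'$. Thus we can
translate Lemma 2.1 into a statement about subspaces of $\mathcal{A}$.

Lemma 2.3. (i) $\mathcal{A}_{s+t}^{\perp}.\mathcal{A}_s \subseteq \mathcal{A}_t^{\perp}$.

(ii) Let $\mathcal{A}_s^- = \bigcup_{t<s} \mathcal{A}_t$, then $(\mathcal{A}_{s+t}^-)^{\perp}.\mathcal{A}_s \subseteq (\mathcal{A}_t^-)^{\perp}$.

Proof. Part (ii) holds because $\mathcal{A}_s.\mathcal{A}_t^- \subseteq \mathcal{A}_{s+t}^-$. □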

2.2. Sampling of Functions on a Compact Lie Group 

2.2.1. Notation and conventions. In what follows, we'll assume $G$ is a connected
compact Lie group, with Lie algebra $\mathfrak{g}$. Let $T$ be a maximal torus of $G$ and $\mathfrak{t}$ be
its Lie algebra; then $\mathfrak{h} = \mathfrak{t}_{\mathbb{C}}$ is a Cartan subalgebra of $\mathfrak{g}_{\mathbb{C}}$. Choose a fundamental
Weyl chamber and for any dominant integral weight, $\lambda$, let $\Delta_\lambda$ be the irreducible
Lie algebra representation of highest weight $\lambda$. If $\hat G$ denotes the unitary dual of
$G$, then the map sending an irreducible unitary representation, $\rho$, to its highest
weight allows us to identify $\hat G$ with a subset of the set of all dominant
integral weights. For any $\lambda$ in $\hat G$ denote the group representation of highest
weight $\lambda$ by $\Delta_\lambda$ as well, and set
$d_\lambda = \dim \Delta_\lambda = \prod_{\alpha \in \Delta^+} (\langle \lambda + \delta, \alpha \rangle / \langle \delta, \alpha \rangle)$, where
$\delta = \frac{1}{2} \sum_{\alpha \in \Delta^+} \alpha$, $\langle\ ,\ \rangle$ is the Killing form, and $\Delta^+$ is the set of positive roots.
Let $r = \dim([G,G] \cap T)$ be the semisimple rank of $G$, $l$ be the dimension of the
center of $G$, and $k$ be the number of positive roots of $G$. Then $2k + r + l = \dim G$,
and $d_\lambda$ is a polynomial of degree $k$ on $\mathfrak{h}^*$. For any representation, $\rho$, of $G$, let $\rho^\vee$
be the representation dual to $\rho$. This gives an involution, $(\ )^\vee$, on $\hat G$.
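The dimension formula is easy to evaluate for G = SU(r + 1). The sketch below is an illustration under stated assumptions: weights are written in coordinates where the i-th fundamental weight is e_1 + ... + e_i, the positive roots are e_i - e_j with i < j, and the standard inner product is used in place of the Killing form (the ratio in the formula is unaffected by rescaling). The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def fundamental_to_vector(a):
    """Coordinates of lambda = sum a_i * lambda_i for SU(r+1), where
    lambda_i = e_1 + ... + e_i (defined modulo e_1 + ... + e_{r+1})."""
    r = len(a)
    return [sum(a[i:]) for i in range(r)] + [0]

def weyl_dimension_su(a):
    """Weyl dimension formula for SU(len(a) + 1) and highest weight sum a_i * lambda_i."""
    lam = fundamental_to_vector(a)
    n = len(lam)
    # delta = half the sum of the positive roots e_i - e_j, i < j.
    delta = [Fraction(n - 1 - 2 * i, 2) for i in range(n)]
    dim = Fraction(1)
    for i, j in combinations(range(n), 2):
        numerator = (lam[i] + delta[i]) - (lam[j] + delta[j])
        denominator = delta[i] - delta[j]
        dim *= numerator / denominator
    return int(dim)

adjoint_su3 = weyl_dimension_su([1, 1])      # the adjoint representation of SU(3)
spin_one_su2 = weyl_dimension_su([2])        # the three-dimensional representation of SU(2)
```

For SU(3) there are k = 3 positive roots, and the product is the cubic polynomial (a_1 + 1)(a_2 + 1)(a_1 + a_2 + 2)/2 in the coefficients, in line with the degree count in the text.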

Choose a norm on $\mathfrak{g}$. For any nonnegative integer, $m$, define a norm on $C^m(G)$
by $\|f\|_{C^m} = \sup\{\|L(X_1 \cdots X_p)f\|_\infty : 0 \le p \le m,\ X_1, \ldots, X_p \in \mathfrak{g},\ \|X_1\| = \cdots =
\|X_p\| = 1\}$, where $L$ is the left regular representation. Denote the dual norm
on $C^m(G)'$ by $\|\ \|_{C^{m\prime}}$. These norms are all invariant under the right regular
representation. If we were to replace the left regular representation by the right
regular representation in the above definitions, we would get an equivalent set of
norms invariant under the left regular representation. For $0 \le m \le \infty$, denote the
bilinear pairing between $C^m(G)'$ and $C^m(G)$ by $(\ ,\ )$. For $\varphi$ in $C^m(G)'$, and
$g, h$ in $C^m(G)$, we have $(\varphi, g.h) = (\varphi.g, h)$. Define an involution on $C^\infty(G)$ by
$\check f(x) = f(x^{-1})$, and anti-involutions by $\bar f(x) = \overline{f(x)}$, $f^*(x) = \overline{f(x^{-1})}$. These
extend to involutions and anti-involutions on $C^\infty(G)'$ by setting $(\check T, f) = (T, \check f)$,
$(\bar T, f) = \overline{(T, \bar f)}$, and $T^* = \bar{\check T}$, for any $T \in C^\infty(G)'$ and $f \in C^\infty(G)$. If $\mu_G$ denotes
Haar measure on $G$ of unit mass, then the map $f \mapsto f.\mu_G$ gives us an inclusion
$L^1(G) \subseteq C^0(G)'$, and since $G$ is compact, we also have inclusions $L^p(G) \supseteq L^q(G)$
for $1 \le p \le q \le \infty$. Denote the $L^p$ norm on $L^p(G)$ by $\|\ \|_p$.

Let $\mathcal{A}$ denote the span of all matrix coefficients of finite dimensional unitary
representations of $G$. Then $\mathcal{A}$ is a subalgebra of $C^\infty(G)$ under pointwise
multiplication of functions. $\mathcal{A}$ is invariant under the involutions, and the pairing
$(\ ,\ )$ restricts to a nondegenerate bilinear form on $\mathcal{A}$. The hermitian form $(f, \bar g)$
is positive definite so the bilinear form is nondegenerate on any subspace of $\mathcal{A}$
closed under $f \mapsto \bar f$. In particular, if $\bar{\mathcal{A}}_s = \mathcal{A}_s$ then we can use the bilinear form to
identify $\mathcal{A}'_s$ with $\mathcal{A}_s$. We shall use $\perp$ to refer to orthogonal complements taken
with respect to the bilinear form. For a subspace closed under $f \mapsto \bar f$ this is the same
as the complement taken with respect to the hermitian form. For any $\lambda \in \hat G$, let
A\ be the span of the matrix coefficients of A^. The Schur relations show easily 
that Aj^ = 

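2.2.2. The Fourier transform. Let $\mathcal{F}(\hat G) = \prod_{\lambda \in \hat G} \operatorname{End} V_\lambda$, where $V_\lambda$ is the Hilbert
space on which $\Delta_\lambda$ acts. Choose a norm on $\mathfrak{h}^*$. For $1 \le q < \infty$ and $0 \le m < \infty$,
define on $\mathcal{F}(\hat G)$ the following norms, which may possibly be infinite:

\[ \|A\|_q = \Bigl( \sum_{\lambda \in \hat G} d_\lambda \|A_\lambda\|_{q,\lambda}^q \Bigr)^{1/q}, \]
\[ \|A\|_\infty = \sup\{ \|A_\lambda\|_{\infty,\lambda} : \lambda \in \hat G \}, \]
\[ \|A\|_{\mathcal{A}_m} = \|A_0\|_{1,0} + \sum_{\lambda \in \hat G \setminus \{0\}} d_\lambda \|\lambda\|^m \|A_\lambda\|_{1,\lambda}, \]
\[ \|A\|_{\mathcal{A}'_m} = \sup\bigl( \{ \|\lambda\|^{-m} \|A_\lambda\|_{\infty,\lambda} : \lambda \in \hat G,\ \lambda \ne 0 \} \cup \{ \|A_0\|_{\infty,0} \} \bigr), \]

where $\|\ \|_{\infty,\lambda}$ is the operator norm on $\operatorname{End} V_\lambda$ relative to the Hilbert space
norm on $V_\lambda$, and for $1 \le q < \infty$, $\|\ \|_{q,\lambda}$ is the norm on $\operatorname{End} V_\lambda$ given by
$\|A_\lambda\|_{q,\lambda} = (\operatorname{Tr}((A_\lambda^* A_\lambda)^{q/2}))^{1/q}$. Let $\mathcal{F}_q(\hat G)$, $\mathcal{A}_m(\hat G)$ and $\mathcal{A}'_m(\hat G)$ be the
corresponding subspaces of $\mathcal{F}(\hat G)$ on which these norms are finite. For general
properties of norms of these types see [11].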

Recall that if $H$ is a complex Hilbert space, and $A$ is a linear operator on
$H$, then $A^*$ is a linear operator on $H$, and $A^t$ is a linear operator on its dual
space, $H'$, as is $\bar A = A^{*t}$. Hence we can define an involution on $\mathcal{F}(\hat G)$, by
$(\check A)_\lambda = (A_{\lambda^\vee})^t$, for $A \in \mathcal{F}(\hat G)$, $\lambda \in \hat G$, and anti-involutions, $(A^*)_\lambda = (A_\lambda)^*$,
$(\bar A)_\lambda = \overline{A_{\lambda^\vee}}$.

We shall now assume that the norm on $\mathfrak{h}^*$ satisfies $\|\lambda^\vee\| = \|\lambda\|$ for any $\lambda \in \hat G$.
Then the maps $(\ )^{\vee}$, $(\ )^*$, and $\overline{(\ )}$ preserve all the above norms on $\mathcal{F}(\hat G)$. Define
a bilinear pairing between A' m and A m , by (A', A) = Tr((A ,4 ) A A A ). 

The map $T : \mathcal{A}'_m(\hat G) \to \mathcal{A}_m(\hat G)'$ given by $T_{A'}(A) = (A', A)$ is an isometric
isomorphism, and so we shall use this map to identify $\mathcal{A}_m(\hat G)'$ and $\mathcal{A}'_m(\hat G)$ from
now on.

Define the Fourier transform to be the map $\mathcal{F} : C^\infty(G)' \to \mathcal{F}(\hat G)$, given by
$(\varphi, \mathcal{F}(s)_\lambda v) = (s, x \mapsto (\varphi, \Delta_\lambda(x)v))$ for any $\varphi \in V_\lambda'$ and $v \in V_\lambda$. When $f$ is
a function in $L^1(G)$ this becomes $(\mathcal{F}f)_\lambda = \int_G f(x)\Delta_\lambda(x)\, d\mu_G(x)$. To make the
statement of the next lemma simpler, it is convenient to choose the
norms on $\mathfrak{h}^*$ and $\mathfrak{g}$ so that $\|\Delta_\lambda(X)\|_{\infty,\lambda} \le \|\lambda\|.\|X\|$; to see that this is possible,
just consider the case where the norm on $\mathfrak{g}$ is Ad-invariant. This condition
can always be achieved by scaling either the norm on $\mathfrak{h}^*$ or the norm on $\mathfrak{g}$.
More specifically, this condition avoids additional constants in the statements of
Lemma 2.4(iv),(vi).

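Lemma 2.4 (Properties of $\mathcal{F}$). Assume $m$ is a nonnegative integer, $1 \le q \le 2$
and $1/q + 1/q' = 1$.

(i) $\mathcal{F} : C^\infty(G)' \to \mathcal{F}(\hat G)$ is one to one.

(ii) $\|\mathcal{F}f\|_{q'} \le \|f\|_q$. These are the Hausdorff-Young inequalities.

(iii) $\mathcal{F}(L^{q'}(G)) \supseteq \mathcal{F}_q(\hat G)$, and for any $A$ in $\mathcal{F}_q(\hat G)$ we have $\|\mathcal{F}^{-1}(A)\|_{q'} \le \|A\|_q$.

(iv) $\mathcal{F}(C^m(G)) \supseteq \mathcal{A}_m(\hat G)$, and for any $A$ in $\mathcal{A}_m(\hat G)$ we have
$\|\mathcal{F}^{-1}A\|_{C^m} \le \|A\|_{\mathcal{A}_m}$.

(v) Assume $T \in C^m(G)'$, $A \in \mathcal{A}_m(\hat G)$, and $f = \mathcal{F}^{-1}A$. Then $(T, f) = (\mathcal{F}T, \mathcal{F}f)$.

(vi) For any $s$ in $C^m(G)'$ we have $\|\mathcal{F}s\|_{\mathcal{A}'_m} \le \|s\|_{C^{m\prime}}$.

(vii) For any $s$ in $C^\infty(G)'$ we have $\mathcal{F}(\check s) = (\mathcal{F}s)^{\vee}$, $\mathcal{F}\bar s = \overline{\mathcal{F}s}$, and $\mathcal{F}s^* = (\mathcal{F}s)^*$.
In particular, $\mathcal{F}$ is real relative to the real structures on $C^\infty(G)'$ and $\mathcal{F}(\hat G)$
induced by the anti-involutions, $\overline{(\ )}$, on these spaces.

(viii) $(\mathcal{F}(s_1 * s_2))_\lambda = (\mathcal{F}s_1)_\lambda (\mathcal{F}s_2)_\lambda$, for any distributions, $s_1, s_2$, in $C^\infty(G)'$,
and any $\lambda$ in $\hat G$, where $s_1 * s_2$ denotes the convolution of the distributions $s_1$
and $s_2$.

(ix) $\|\mathcal{F}(s_1 * s_2)\|_{\mathcal{A}'_{m_1+m_2}} \le \|\mathcal{F}s_1\|_{\mathcal{A}'_{m_1}} \|\mathcal{F}s_2\|_{\mathcal{A}'_{m_2}}$.

Proof. See [20; 11]. □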

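The image $\mathcal{F}\mathcal{A}$ consists of precisely those elements, $A$, of $\mathcal{F}(\hat G)$ such that $A_\lambda = 0$
except for finitely many $\lambda$. All the norms defined above are finite on $\mathcal{F}\mathcal{A}$, and
$\mathcal{F}\mathcal{A}$ is dense in each of these spaces under the corresponding norm. As $\mathcal{F}$ is one
to one, we can transfer the algebra structure on $\mathcal{A}$ to $\mathcal{F}\mathcal{A}$, and hence obtain an
$\mathcal{A}$-module structure on the spaces $\mathcal{A}_m$ and $\mathcal{A}'_m$. The map $T$ is an isomorphism
of $\mathcal{A}$-modules, and we can use the same formula to get a dual pairing between
$\mathcal{F}(\hat G)$ and $\mathcal{F}\mathcal{A}$, and hence an $\mathcal{A}$-module isomorphism between $(\mathcal{F}\mathcal{A})'$ and $\mathcal{F}(\hat G)$.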
2.2.3. Simple bounds for $M(s,t)$. Let us assume that an increasing set of finite
dimensional subspaces $\{\mathcal{A}_s\}$ is given, that $\mathcal{A}_s.\mathcal{A}_t \subseteq \mathcal{A}_{s+t}$, and that $\bigcup_s \mathcal{A}_s = \mathcal{A}$.
Examples of such subspaces can be obtained from finite dimensional generating
sets of $\mathcal{A}$, or as described in Section 2.2.4, from a norm on $\mathfrak{h}^*$. We shall bound
$M(s,t)$ for several different choices of norms, $\|\ \|_A$, $\|\ \|_B$, $\|\ \|_{A_s}$, on $\mathcal{A}$ and $\mathcal{A}_s$.

Using the Leibniz rule one sees that for $f, g \in C^m(G)$, we have
$\|fg\|_{C^m} \le 2^m \|f\|_{C^m} \|g\|_{C^m}$. Therefore

Result. Assume the $A$, $B$ norms are both $\|\ \|_{C^m}$ and that $\|\ \|_{A_s}$ is the restriction
of $\|\ \|_{C^m}$ to $\mathcal{A}_s$. Then $M(s,t) \le 2^m$.

When $m = 0$, this tells us that if $\varphi$ is a regular bounded complex Borel measure
on $G$ satisfying $P_{s+t}(\varphi - \mu_G) = 0$, $h$ is a continuous function on $G$, and
$Y = (g \mapsto (\Delta_\lambda(g)u, v))$ is a matrix coefficient in $\mathcal{A}_s$, then
$|\int_G h.Y\, d\varphi - \int_G h.Y\, d\mu_G| \le \|u\| \|v\| \|\varphi\|_{C^{0\prime}} \|h\|_{C^0/\mathcal{A}_t}$.
Clearly $\|h\|_{C^0/\mathcal{A}_t}$ tends to zero as $t$ tends to infinity.

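In a similar fashion, we can bound $M(s,t)$ for weaker choices of the norm
$\|\ \|_{A_s}$ on $\mathcal{A}_s$.

Result. Assume the $A$, $B$ norms are both $\|\ \|_{C^m}$ and that $\|\ \|_{A_s}$ is the restriction
of $\|\ \|_{C^0}$ to $\mathcal{A}_s$; then for some $K > 0$, independent of $m$,

\[ M(s,t) \le K^m \Bigl( 1 + \sum_{\mathcal{A}_s \cap \mathcal{A}_\lambda \ne 0} d_\lambda^2 \|\lambda\|^m \Bigr). \]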

Consider this for $s = t$. Assume that $\varphi$ is a distribution of order $m$ on $G$
satisfying $P_{2s}(\varphi - \mu_G) = 0$, and $h$ is a $C^m$ function on $G$. Then

\[ \|P_s(h.\varphi - h.\mu_G)\|_{C^0} \le 2^m \Bigl( 1 + \sum_{\mathcal{A}_s \cap \mathcal{A}_\lambda \ne 0} d_\lambda^2 \|\lambda\|^m \Bigr) \|\varphi\|_{C^{m\prime}} \|h\|_{C^m/\mathcal{A}_s}, \]

but the sum in this bound is bounded from below by a constant times $s^{2k+m+r+l}$,
and we are forced to consider higher differentiability conditions on $h$ in order to
get convergence of $\|P_s(h.\varphi - h.\mu_G)\|_{C^0}$ to zero. Doing so leads us naturally to
consider the norms $\mathcal{A}_m$ on $\mathcal{A}$, and more careful arguments with these new
norms will give us more refined bounds on $M(s,t)$ in the situation above.

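2.2.4. Norms on $\hat G$. Let $\|\ \|$ be a norm on $\mathfrak{h}^*$. For any $s \ge 0$ let $\mathcal{A}_s$ be
the span of all the matrix coefficients of representations $\Delta_\lambda$ for $\|\lambda\| \le s$, i.e.
$\mathcal{A}_s = \sum_{\|\lambda\| \le s} \mathcal{A}_\lambda$. There are several properties we may require of this norm on
$\mathfrak{h}^*$. We say that a norm $\|\ \|$ on $\mathfrak{h}^*$ has property I if whenever $\lambda, \mu, \nu$ are in $\hat G$,
and $\Delta_\nu$ is a summand of $\Delta_\lambda \otimes \Delta_\mu$, then $\|\nu\| \le \|\lambda\| + \|\mu\|$. We say that $\|\ \|$ has
property II if $\|\nu'\| \le \|\lambda\|$ whenever $\nu'$ is a weight of $\Delta_\lambda$.

Lemma 2.5. $\|\ \|$ has property I if and only if for any $s, t \ge 0$, $\mathcal{A}_s.\mathcal{A}_t \subseteq \mathcal{A}_{s+t}$.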
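Lemma 2.6. (i) If $\|\ \|$ satisfies property I, and $\Delta_\nu$ is a summand of $\Delta_\lambda \otimes \Delta_\mu$,
then $|\, \|\lambda\| - \|\mu\| \,| \le \|\nu\|$.
(ii) $\|\ \|$ has property I if and only if $|\, \|\lambda\| - \|\nu\| \,| \le \|\mu\|$ whenever $\Delta_\nu$ is a summand
of $\Delta_\lambda \otimes \Delta_\mu$.

Proof. Part (ii) is a direct consequence of (i). To prove (i), assume I, and
suppose $\Delta_\nu$ is a summand of $\Delta_\lambda \otimes \Delta_\mu$. Then $\mathcal{A}_\nu \subseteq \mathcal{A}_\lambda.\mathcal{A}_\mu$. For any $s > 0$,
let $\mathcal{A}_s^- = \sum_{\|\lambda\| < s} \mathcal{A}_\lambda$. Then $\mathcal{A}_\lambda \subseteq (\mathcal{A}^-_{\|\lambda\|})^{\perp}$, and $\mathcal{A}_\mu \subseteq \mathcal{A}_{\|\mu\|}$. Assume $\|\lambda\| \ge \|\mu\|$.
Lemma 2.3 shows that $(\mathcal{A}^-_{\|\lambda\|})^{\perp}.\mathcal{A}_{\|\mu\|} \subseteq (\mathcal{A}^-_{\|\lambda\| - \|\mu\|})^{\perp}$. Hence
$\mathcal{A}_\nu \subseteq (\mathcal{A}^-_{\|\lambda\| - \|\mu\|})^{\perp}$, and so $\|\lambda\| - \|\mu\| \le \|\nu\|$. □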
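Property I and the inequalities of Lemma 2.6 can be checked by hand in the rank one case. The sketch below is an illustration, not part of the paper: it assumes G = SU(2), labels highest weights by nonnegative integers with dim = weight + 1, takes the norm of a weight to be the integer itself, and uses the Clebsch-Gordan rule for the tensor product. The function names are hypothetical.

```python
def tensor_summands(lam, mu):
    """Highest weights of the irreducible summands of V_lam (x) V_mu for SU(2)
    (Clebsch-Gordan rule): |lam - mu|, |lam - mu| + 2, ..., lam + mu."""
    return list(range(abs(lam - mu), lam + mu + 1, 2))

def dimensions_add_up(max_weight):
    """Sanity check: the dimensions of the summands sum to dim V_lam * dim V_mu."""
    return all(sum(nu + 1 for nu in tensor_summands(lam, mu)) == (lam + 1) * (mu + 1)
               for lam in range(max_weight + 1) for mu in range(max_weight + 1))

def norm_inequalities_hold(max_weight):
    """Check property I and both inequalities of Lemma 2.6 with ||lam|| = lam."""
    for lam in range(max_weight + 1):
        for mu in range(max_weight + 1):
            for nu in tensor_summands(lam, mu):
                property_one = nu <= lam + mu
                part_i = abs(lam - mu) <= nu
                part_ii = abs(lam - nu) <= mu
                if not (property_one and part_i and part_ii):
                    return False
    return True
```

In this case the three inequalities are just the familiar triangle conditions on the weights appearing in a tensor product.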

To show that II implies I, we need the following lemma.

Lemma 2.7. Assume $\lambda, \mu, \nu$ are dominant integral weights. If $\Delta_\nu$ is a summand
of $\Delta_\lambda \otimes \Delta_\mu$, then $\nu = \mu + \nu'$ where $\nu'$ is a weight of $\Delta_\lambda$.

Proof. Follows from Steinberg's formula for the decomposition of tensor
products. See [12]. □

Corollary 2.8. II implies I.

All the norms on $\mathfrak{h}^*$ which we will use will satisfy property I. Let us now show
that norms satisfying properties I or II really do exist.

Assume ( , ) is a positive definite Ad-invariant inner product on g 1L \ Then 
define $\|\mu\|_{\mathrm{Ad}} = \sqrt{(\mu, \mu)}$. This gives a norm on $\mathfrak{h}^*$ which is invariant under the
Weyl group. 

For calculations involving the classical groups another set of norms is more
convenient. Assume $G$ is a simple classical group and let $\lambda_1, \ldots, \lambda_r$ be the
fundamental dominant weights with the standard labeling (i.e. that which appears
in [12, p. 58]). Define the linear functional, $H$, on $\mathfrak{h}^*$ by requiring that for
$\mu = \sum a_i \lambda_i$, we have

(i) $H(\mu) = \sum_{i=1}^{r} a_i$ when $G$ is SU(r + 1) or Sp(r).

(ii) $H(\mu) = \sum_{i=1}^{r-1} a_i + \frac{1}{2} a_r$ when $G$ is SO(2r + 1).

(iii) $H(\mu) = \sum_{i=1}^{r-2} a_i + \frac{1}{2}(a_{r-1} + a_r)$ when $G$ is SO(2r).

Define a norm $\|\ \|_H$ on $\mathfrak{h}^*$ by requiring that $\|\mu\|_H = H(\mu)$ for any dominant
weight $\mu$ and $\|\ \|_H$ is invariant under the Weyl group. Note that in each of the
above cases $\|\ \|_H$ is also invariant under $\vee$.

To verify that we indeed have defined norms it is easiest to use a different
description. Let $\{e_i\}$ denote the usual basis of $\mathbb{C}^r$. When $G$ is SU(r + 1) we
have an isomorphism between $\mathfrak{h}^*$ and $\mathbb{C}^{r+1}/(e_1 + \cdots + e_{r+1} = 0)$ such that
$\lambda_i = \sum_{j=1}^{i} e_j$. When $G$ is any other simple classical group we have an isomorphism
between $\mathfrak{h}^*$ and $\mathbb{C}^r$ with $\lambda_i = \sum_{j=1}^{i} e_j$ for $1 \le i \le r-2$, and $\lambda_{r-1} = e_1 + \cdots + e_{r-1}$,
$\lambda_r = e_1 + \cdots + e_r$ for Sp(r), $\lambda_{r-1} = e_1 + \cdots + e_{r-1}$, $\lambda_r = \frac{1}{2}(e_1 + \cdots + e_r)$ for SO(2r + 1),
and $\lambda_{r-1} = \frac{1}{2}(e_1 + \cdots + e_{r-1} - e_r)$, $\lambda_r = \frac{1}{2}(e_1 + \cdots + e_r)$ for SO(2r). When $G$
is Sp(r), SO(2r + 1) or SO(2r), the norm $\|\ \|_H$ corresponds to the sup norm on
$\mathbb{C}^r$. When $G$ is SU(r + 1) it corresponds to twice the quotient of the sup norm
on $\mathbb{C}^{r+1}$.
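The claim about the sup norm can be checked directly on dominant weights for Sp(r), SO(2r + 1) and SO(2r). The following sketch is an illustration only: the string labels for the three families, the function names, and the range of coefficients tested are assumptions, and the fundamental weights are entered in the coordinates just described.

```python
from fractions import Fraction
from itertools import product

def fundamental_weights(kind, r):
    """Fundamental weights in e-coordinates; kind is "Sp", "SO_odd" or "SO_even"."""
    half = Fraction(1, 2)
    weights = [[Fraction(1)] * i + [Fraction(0)] * (r - i) for i in range(1, r + 1)]
    if kind == "SO_odd":
        weights[r - 1] = [half] * r
    elif kind == "SO_even":
        weights[r - 2] = [half] * (r - 1) + [-half]
        weights[r - 1] = [half] * r
    return weights

def H(kind, a):
    """The linear functional H on the dominant weight sum a_i * lambda_i."""
    r = len(a)
    if kind == "Sp":
        return Fraction(sum(a))
    if kind == "SO_odd":
        return sum(a[:r - 1]) + Fraction(a[r - 1], 2)
    return sum(a[:r - 2]) + Fraction(a[r - 2] + a[r - 1], 2)

def sup_norm(kind, a):
    """Sup norm of sum a_i * lambda_i in e-coordinates."""
    r = len(a)
    lam = fundamental_weights(kind, r)
    coords = [sum(a[i] * lam[i][j] for i in range(r)) for j in range(r)]
    return max(abs(c) for c in coords)

def agree(kind, r, max_coeff=3):
    """Compare H with the sup norm on all dominant weights with coefficients <= max_coeff."""
    return all(H(kind, a) == sup_norm(kind, a)
               for a in product(range(max_coeff + 1), repeat=r))
```

For a dominant weight the largest coordinate is always the coefficient of e_1, which is exactly the sum defining H in each case.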
Lemma 2.9. (i) If $\mathfrak{g}$ is abelian, then any norm on $\mathfrak{h}^*$ has property II.

(ii) Assume $\|\ \|_1$, $\|\ \|_2$ are norms for $\mathfrak{g}_1$ and $\mathfrak{g}_2$ (that is, on $\mathfrak{h}_1^*$ and $\mathfrak{h}_2^*$) which both
satisfy the same property I or II. Assume $\mathfrak{g} = \mathfrak{g}_1 \oplus \mathfrak{g}_2$, and $\|\lambda_1 + \lambda_2\| = \|\lambda_1\|_1 + \|\lambda_2\|_2$ for any
$\lambda_1 \in \mathfrak{h}_1^*$ and $\lambda_2 \in \mathfrak{h}_2^*$. Then $\|\ \|$ satisfies the corresponding property I or II on
$\mathfrak{h}^* = \mathfrak{h}_1^* \oplus \mathfrak{h}_2^*$.

(iii) $\|\ \|_{\mathrm{Ad}}$ has property II for any $\mathfrak{g}$.

(iv) $\|\ \|_H$ has property II for any of the simple classical groups.

Proof. Parts (i) and (ii) are trivial. For (iii), note that $\mathfrak{g} = \mathfrak{z} \oplus [\mathfrak{g}, \mathfrak{g}]$ is
an orthogonal direct sum, so we need only prove the result in the case where
$G$ is semisimple and $(\ ,\ )$ on it is simply the Killing form. So let's assume
that this is the case, $\lambda \in \hat G$, and $\mu$ is a weight of $\Delta_\lambda$. Since all elements of
the Weyl group are isometries, we may also assume that $\mu$ is dominant. Then
$(\lambda, \lambda) - (\mu, \mu) = (\lambda + \mu, \lambda - \mu)$, which is nonnegative because $\lambda + \mu$ is a
dominant weight and $\lambda - \mu$ is in the positive root lattice.

Part (iv) is equivalent to the condition that $H(\alpha) \ge 0$ for any simple root $\alpha$.
This is easily checked by inspection of the Cartan matrices of the simple classical
Lie algebras. □

There is a nice interpretation of $\mathcal{A}_s$ in the case where $G$ is SU(r + 1), Sp(r) or
SO(2r + 1), and $\|\ \| = \|\ \|_H$. In this case, $\mathcal{A}_1$ is the span of the matrix coefficients
of the representations with highest weight a fundamental analytically integral
dominant weight (i.e. an element of a basis for the analytically integral dominant
weights over the nonnegative integers) or 0. Hence $\mathcal{A}_1$ is a finite dimensional
generating set for $\mathcal{A}$, and for any positive integer $s$, $\mathcal{A}_s$ is the span of all products
of up to $s$ elements of $\mathcal{A}_1$. In particular, $\mathcal{A}_s.\mathcal{A}_t = \mathcal{A}_{s+t}$.

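2.2.5. Further bounds for $M(s,t)$. We shall now bound $M(s,t)$, as defined in
Section 2.1, where $\|\ \|_A = \|\ \|_{\mathcal{A}_m}$, $\|\ \|_B = \|\ \|_{\mathcal{A}_p}$. It is clear that the pairing
between $\mathcal{A}'_m$ and $\mathcal{A}_m$ allows us to identify $\mathcal{F}\mathcal{A}'_s$ with $\mathcal{F}\mathcal{A}_s$, and that $\mathcal{A}_m$ and $\mathcal{A}'_m$
are dual norms on this finite dimensional subspace. In the definition of $\widetilde{M}(s,t)$ we
shall use $\|\ \|_{A_s} = \|\ \|_{\mathcal{A}'_{m_1}}$, $\|\ \|_{A'_s} = \|\ \|_{\mathcal{A}_{m_1}}$. The projection, $P_s$, from $\mathcal{A}' = \mathcal{F}(\hat G)$
onto $\mathcal{F}\mathcal{A}_s$ is given by $(P_s A)_\lambda = 0$ when $\|\lambda\| > s$, and $(P_s A)_\lambda = A_\lambda$ when $\|\lambda\| \le s$.
The quotient norm on $\mathcal{A}_p(G)/\mathcal{F}\mathcal{A}_t$ is clearly given by $\|f\|_{\mathcal{A}_p/\mathcal{F}\mathcal{A}_t} = \|f - P_t f\|_{\mathcal{A}_p}$.
Hence

\[ M(s,t) = \sup\{ \|P_s(h.\varphi)\|_{\mathcal{A}_{m_1}} : h, \varphi \in \mathcal{F}\mathcal{A},\ \|h\|_{\mathcal{A}_p} = 1,\ \|\varphi\|_{\mathcal{A}'_m} = 1,\ P_{s+t}\varphi = 0 \}, \]
\[ \widetilde{M}(s,t) = \sup\{ \|e.h - P_{s+t}(e.h)\|_{\mathcal{A}_m} : \|h\|_{\mathcal{A}_p} = 1,\ \|e\|_{\mathcal{A}'_{m_1}} = 1,\ e \in \mathcal{F}\mathcal{A}_s \}. \]

The bounds for $M(s,t)$ depend on the following lemma.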

Lemma 2.10. Assume $f, g$ are in $\mathcal{A}_0(G)$. Then $f.g$ is well-defined, and

\[ \|f.g\|_{\mathcal{A}_0} \le \|f\|_{\mathcal{A}_0} \|g\|_{\mathcal{A}_0}. \]

Proof. See [11]. □
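Theorem 2.11. Assume the norm on $\mathfrak{h}^*$ satisfies property I. Then there is a
$K_G > 0$ such that for any nonnegative integers, $p \ge m \ge 0$, and any $s, t \ge 1$, we
have

\[ M(s,t) \le K_G\, s^{2k+2r+l+m_1} (s+t)^m t^{-p}. \]

Proof. Assume that $e \in \mathcal{F}\mathcal{A}_s$, $h \in \mathcal{F}\mathcal{A}$ are such that $\|h\|_{\mathcal{A}_p} = 1$, and $\|e\|_{\mathcal{A}'_{m_1}} = 1$.
For any $\lambda$ in $\hat G$, let $P_\lambda$ denote the projection from $\mathcal{F}(\hat G)$ onto the subspace
corresponding to $\operatorname{End} V_\lambda$. Let $e_\nu = P_\nu e$, $h_\lambda = P_\lambda h$, and let $\Pi(\nu)$ denote the set
of weights of $\Delta_\nu$.

Then

\[ \begin{aligned}
\|e.h\|_{\mathcal{A}_m/\mathcal{F}\mathcal{A}_{s+t}}
 &\le \sum_{\|\mu\| > s+t} d_\mu \|\mu\|^m \|P_\mu(e.h)\|_{1,\mu} \\
 &\le \sum_{\|\mu\| > s+t} d_\mu \|\mu\|^m
      \sum_{\substack{\|\nu\| \le s,\ \lambda - \mu \in \Pi(\nu) \\ |\,\|\lambda\| - \|\mu\|\,| \le \|\nu\|}} \|P_\mu(e_\nu.h_\lambda)\|_{1,\mu} \\
 &\le \sum_{\|\nu\| \le s} d_\nu^2 \max\{1, \|\nu\|^{m_1}\} \sum_{\mu, \lambda} \|\mu\|^m d_\lambda \|h_\lambda\|_{1,\lambda},
\end{aligned} \]

where we used the inequality

\[ \|P_\mu(e_\nu.h_\lambda)\|_{1,\mu} \le d_\mu^{-1} d_\lambda d_\nu \|h_\lambda\|_{1,\lambda} \|e_\nu\|_{1,\nu} \le d_\lambda d_\nu^2 \|h_\lambda\|_{1,\lambda} \|e_\nu\|_{\infty,\nu}, \]

which follows directly from Lemma 2.10. Now sum on $\mu$ to see that for
some $K > 0$, the above quantities are bounded by

\[ \begin{aligned}
 & \sum_{\|\nu\| \le s} d_\nu^2 \max\{1, \|\nu\|^{m_1}\}\, |\Pi(\nu)| \sum_{\|\lambda\| > t} d_\lambda (\|\lambda\| + \|\nu\|)^m \|h_\lambda\|_{1,\lambda} \\
 &\quad \le \sum_{\|\nu\| \le s} d_\nu^2 \max\{1, \|\nu\|^{m_1}\}\, |\Pi(\nu)|\, (\|\nu\| + t)^m t^{-p} \sum_{\|\lambda\| > t} d_\lambda \|\lambda\|^p \|h_\lambda\|_{1,\lambda} \\
 &\quad \le (s+t)^m s^{m_1} t^{-p} \Bigl( \sum_{\|\nu\| \le s} d_\nu^2 |\Pi(\nu)| \Bigr) \sum_{\|\lambda\| > t} d_\lambda \|\lambda\|^p \|h_\lambda\|_{1,\lambda} \\
 &\quad \le K s^{2k+2r+l+m_1} (s+t)^m t^{-p} \sum_{\|\lambda\| > t} d_\lambda \|\lambda\|^p \|h_\lambda\|_{1,\lambda}.
\end{aligned} \]

The last inequality holds because there is a constant $C > 0$ such that
$|\Pi(\nu)| \le C \|\nu\|^r$. This holds for the norm $\|\ \|_{\mathrm{Ad}}$ and hence for any other norm on $\mathfrak{h}^*$. □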

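When $G$ is abelian we can get a more explicit bound for even more general norms
on $\mathcal{F}\mathcal{A}$. We shall bound $M(s,t)$ for slightly more general choices of $\|\ \|_A$, $\|\ \|_B$
and $\|\ \|_{A_s}$ than we used above. We have $d_\lambda = 1$, so each $\operatorname{End} V_\lambda$ is naturally
and uniquely isomorphic to $\mathbb{C}$. Define norms on $\mathcal{F}\mathcal{A}$, for $1 \le q < \infty$ and
$-\infty < m < \infty$, by

\[ \|A\|_{\mathcal{F}_q\mathcal{A}_m} = \Bigl( |A_0|^q + \sum_{\lambda \in \hat G \setminus \{0\}} (\|\lambda\|^m |A_\lambda|)^q \Bigr)^{1/q}, \]
\[ \|A\|_{\mathcal{F}_\infty\mathcal{A}_m} = \sup\bigl( \{ \|\lambda\|^m |A_\lambda| : \lambda \in \hat G,\ \lambda \ne 0 \} \cup \{ |A_0| \} \bigr). \]

If $1/q + 1/q' = 1$, then $\|\ \|_{\mathcal{F}_{q'}\mathcal{A}_{-m}}$ is the dual norm to $\|\ \|_{\mathcal{F}_q\mathcal{A}_m}$, and when both
norms are restricted to $\mathcal{A}_s$, this holds for $q = \infty$ as well. When $m = 0$ we have
$\|\ \|_{\mathcal{F}_q\mathcal{A}_0} = \|\ \|_q$, and when $q$ is 1 or $\infty$, $m \ge 0$, we have $\|\ \|_{\mathcal{F}_1\mathcal{A}_m} = \|\ \|_{\mathcal{A}_m}$ and
$\|\ \|_{\mathcal{F}_\infty\mathcal{A}_{-m}} = \|\ \|_{\mathcal{A}'_m}$. Now let $\|\ \|_{A'_s}$ be the restriction of $\|\ \|_{\mathcal{F}_{q_1}\mathcal{A}_{m_1}}$ to $\mathcal{F}\mathcal{A}_s$, let
$\|\ \|_A = \|\ \|_{\mathcal{F}_{q_2}\mathcal{A}_{m_2}}$ and $\|\ \|_B = \|\ \|_{\mathcal{F}_{q_3}\mathcal{A}_{m_3}}$.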

Theorem 2.12. Assume $G$ is abelian, $1 \le q_1, q_2, q_3 \le \infty$, and $s$ and $t$ are
positive integers. Then

\[ M(s,t) \le \Bigl( 1 + \sum_{\|\nu\| \le s} (\|\nu\|^{m_1})^{q_1} \Bigr)^{1/q_1} (s+t)^{m_2} t^{-m_3} \]

provided $q_3 \le q_2$ and $m_3 \ge m_2$.

Proof. Similar to 2.11, except in this case, start with $h, \varphi$ in $\mathcal{F}\mathcal{A}$ and expand
out the product $h.\varphi$ directly. □

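2.2.6. Examples: Sampling for $S^1$, SO(3), and the simple classical Lie groups

The Simplest Example: Sampling on $S^1$. Assume $m$ is a nonnegative integer,
$f$ is a $C^m$ complex function on $S^1$, $\varphi$ is a distribution of order at most $m$
on $S^1$, and $f$, $\varphi$ and $f.\varphi$ have the Fourier expansions $\sum_k c_k z^k$, $\sum_k m_k z^k$ and
$\sum_k b_k z^k$ respectively. Then $\|\mathcal{F}f\|_q = (\sum_k |c_k|^q)^{1/q}$, $\|\mathcal{F}f\|_{\mathcal{A}_m} = \sum_k |k|^m |c_k|$ and
$\|\mathcal{F}\varphi\|_{\mathcal{A}'_m} = \sup\{ |k|^{-m} |m_k| : k \in \mathbb{Z} \}$. Hence

\[ \begin{aligned}
\Bigl( \sum_{|k| \le s} |b_k - c_k|^q \Bigr)^{1/q}
 &\le (2s+1)^{1/q} \Bigl( 1 + \frac{s}{t} \Bigr)^m N \sum_{|k| > t} |k|^m |c_k| \\
 &\le (2s+1)^{1/q} \Bigl( 1 + \frac{s}{t} \Bigr)^m N \frac{\pi}{\sqrt{3}} \Bigl( \sum_{|k| > t} |k^{m+1} c_k|^2 \Bigr)^{1/2},
\end{aligned} \]

provided $m_k = 0$ for $0 < |k| \le s + t$ and $m_0 = 1$, and where $N = \sup\{ |k|^{-m} |m_k| :
|k| > s + t \}$. The factor $\pi/\sqrt{3}$ could be replaced by a factor of the form $C t^{-\varepsilon}$
for any $\varepsilon$ strictly less than $1/2$. When $f$ is $C^{m+1}$ we can further bound this sum
by a Sobolev norm, as

\[ \Bigl( \sum_{|k| > t} |k^{m+1} c_k|^2 \Bigr)^{1/2}
   = \Bigl( \frac{1}{2\pi} \int_0^{2\pi} \Bigl| \frac{d^{m+1}}{d\theta^{m+1}} (f - P_t f)(e^{i\theta}) \Bigr|^2 d\theta \Bigr)^{1/2}. \]

Setting $m = 0$ and $q = \infty$ in the above gives us the results of the introduction.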
Example: Sampling on SO(3). For this example we take $G$ = SO(3). Then the
dual $\hat G$ can be identified with the set of nonnegative integers. The dimension
function is $d_\lambda = 2\lambda + 1$, the rank is $r = 1$, there is only one positive root, and
the dimension of the center of SO(3) is zero. Then following the proofs above we
find that when the $A$ and $B$ norms are $\|\ \|_{\mathcal{A}_m}$, $\|\ \|_{\mathcal{A}_p}$, $p \ge m$, and $\|\ \|_{A_s} = \|\ \|_{\mathcal{A}'_0}$,
we have

\[ \begin{aligned}
M(s,t) &\le \Bigl( \sum_{\nu=0}^{s} (2\nu + 1)^3 \Bigr) \Bigl( 1 + \frac{s}{t} \Bigr)^m t^{m-p} \\
       &\le (s+1)^2 (1 + 4s + 2s^2) \Bigl( 1 + \frac{s}{t} \Bigr)^m t^{m-p}.
\end{aligned} \]
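The passage from the sum to the closed form in this bound can be checked mechanically. The sketch below assumes the summand is the cube of the dimension $2\nu + 1$ and that the closed form is $(s+1)^2(1 + 4s + 2s^2)$; the function names are hypothetical. It also compares the closed form with the constant $2(s+1)^4$ that appears in the sphere example.

```python
def weighted_count(s):
    """Sum over nu = 0..s of (2*nu + 1)**3."""
    return sum((2 * nu + 1) ** 3 for nu in range(s + 1))

def closed_form(s):
    """The polynomial (s + 1)^2 * (1 + 4s + 2s^2)."""
    return (s + 1) ** 2 * (1 + 4 * s + 2 * s * s)

def so3_bound(s, t, m, p):
    """The resulting bound closed_form(s) * (1 + s/t)^m * t^(m - p)."""
    return closed_form(s) * (1 + s / t) ** m * t ** (m - p)

identity_holds = all(weighted_count(s) == closed_form(s) for s in range(200))
dominated = all(closed_form(s) <= 2 * (s + 1) ** 4 for s in range(200))
```

The sum of the first n odd cubes is n^2(2n^2 - 1), so with n = s + 1 the two expressions coincide, and with m = p = 0 the bound is at most 2(s + 1)^4.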

Example: The classical simple Lie groups. Assume $G$ is a classical simple
compact Lie group. Let the norm on $\mathfrak{h}^*$ be $\|\ \|_H$, let the $A$, $B$, and $A_s$ norms be
$\|\ \|_{\mathcal{A}_m}$, $\|\ \|_{\mathcal{A}_p}$ and $\|\ \|_{\mathcal{A}'_0}$, where $p \ge m$. Let $\Lambda_R$ be the root lattice, and let $B_s$
denote the closed ball of radius $s$ for $\|\ \|_H$. Then the proofs above, together with
property II, show that

\[ M(s,t) \le (s+t)^m t^{-p} \sum_{\|\nu\|_H \le s} d_\nu^2\, |(\nu + \Lambda_R) \cap B_{\|\nu\|_H}|, \]

where the sum is over analytically integral dominant weights. We can bound
$|(\nu + \Lambda_R) \cap B_{\|\nu\|_H}|$ for such $\nu$ as follows.

(i) $G$ = SU(r + 1): $|(\nu + \Lambda_R) \cap B_{\|\nu\|_H}| \le (s + r + 1)^r$.

(ii) $G$ = Sp(r): $|(\nu + \Lambda_R) \cap B_{\|\nu\|_H}| \le 2^{r-1}(s+1)^r$.

(iii) $G$ = SO(2r + 1): $|(\nu + \Lambda_R) \cap B_{\|\nu\|_H}| \le (2s+1)^r$.

(iv) $G$ = SO(2r): $|(\nu + \Lambda_R) \cap B_{\|\nu\|_H}| \le 2(s+1)^2 (2s+1)^{r-2}$.

We can use these bounds and the Weyl dimension formula to obtain explicit
bounds on $M(s,t)$.



(i) G = SU(r + 1): 

M(s,t) < 

(ii) G = Sp(r): 

M(s, t) < 



(r + 3).r!n[=i*! 



_( s + i rt -p( s+ _ + 



1 



r 5\ r+3r 
3 + 2, 



5r 7\ 2r+2 r 



( r + 1)- ]X=i(2* — i) 

(iii) G = SO(2r + 1): 

M(s,t)< 1 






5r 25\ 2r+2r 



(r+l)!nU(2<-l) 

(iv) G = SO(2r):



p 2-’« r -i (s + tn -p( s+ g + |) 



— 1 2 / K r \ 2r + 2r 

M(s,t)< , 2 r +2r - 2 (s + t) m t- p s+ — + 1 , for r > 3. 

^ 1 ~ r.r\U r ~}(2i)P y J \ 12 J 



r-rlUZm'- 
2.2.7. Differentiability and Sampling. We shall now see how the differentiability
of the function being sampled plays a role. Define $A_m(G)$ to be the set of all
continuous functions, $f$, on $G$, such that $\mathcal{F}f$ is in $A_m(\hat{G})$. Define $\|\ \|_{A_m}$ on
$A_m(G)$ by $\|f\|_{A_m} = \|\mathcal{F}f\|_{A_m}$. Then we have the following result.

Lemma 2.13. Assume $p$ is a nonnegative real number and $m$ is a positive integer,
and let $X_1, \ldots, X_n$ be a basis for the complexified Lie algebra, $\mathfrak{g}_{\mathbb{C}}$, of the connected
simple Lie group $G$. Then

$$A_{p+m}(G) = \{ f \in C^{p+m}(G) : L(X_{i_1} \cdots X_{i_m}) f \in A_p(G) \text{ for all } 1 \le i_1, \ldots, i_m \le n \}$$

and the following norms on $A_{p+m}$ are equivalent:

(i) $\|f\|_{A_{p+m}}$.

(ii) $\max\{ \|L(X_{i_1} \cdots X_{i_j}) f\|_{A_p} : 0 \le j \le m, \text{ and } 1 \le i_1, \ldots, i_j \le n \}$.

(iii) $\max\{ \|L(Y_1 \cdots Y_j) f\|_{A_p} : 0 \le j \le m,\ Y_1, \ldots, Y_j \in \mathfrak{g}_{\mathbb{C}},\ \|Y_1\| = \cdots = \|Y_j\| = 1 \}$.

In addition, this holds when $G$ is an arbitrary compact connected Lie group and
$m$ is even.

Proof. See [20]. □ 

Lemma 2.14. Assume $G$ is a compact group of dimension $n$ and that $m > n/2$.
Then $C^m(G) \subseteq A_0(G)$, and this inclusion is continuous relative to the Sobolev
norm on $C^m(G)$ given by

$$\|f\|_{W_m} = \sup\{ \|L(Y_1 \cdots Y_j) f\|_2 : 0 \le j \le m,\ Y_1, \ldots, Y_j \in \mathfrak{g}_{\mathbb{C}},\ \|Y_i\| = 1 \}$$

and the norm $\|\ \|_{A_0}$ on $A_0(G)$.

Proof. The space $C^m(G)$ is continuously included in a suitable Besov space on $G$,
which in turn is continuously included in $A_0(G)$. For definitions and proof, see
[27] and [6]. □

Now we can use the bounds we have been obtaining to find convergence conditions
on a sequence of measures $\varphi_s$ and differentiability conditions on a function
$f$, that ensure that $\|\mathcal{F}P_s(f - f.\varphi_s)\|_{A_{m_1}}$ tends to zero.

Corollary 2.15. Assume that $G$ is an $n$-dimensional compact connected Lie
group, $m, m_1, p$ are nonnegative integers, and $\varphi_s$ is a sequence of distributions
in $C^m(G)'$ converging weak-* to Haar measure and satisfying $P_{2s}(\varphi_s - 1) = 0$.
Assume $f$ is a function on $G$.

(i) If $f$ is in $C^{\lceil 3n/2 \rceil + r + m + m_1 + p + 1}$, then $s^p \|\mathcal{F}P_s(f - f.\varphi_s)\|_{A_{m_1}}$ tends to zero as
$s$ tends to infinity.

(ii) If $f$ is in $C^{\lceil 3n/2 \rceil + r + m + m_1 + p}$ and either $G$ is simple or $n + m + m_1 + r + p$
is even, then $s^p \|\mathcal{F}P_s(f - f.\varphi_s)\|_{A_{m_1}}$ tends to zero as $s$ tends to infinity.
PROOF. For clarity, let’s just prove the case where $m_1 = p = 0$, and $G$ is
simple. Assume that $f$ is in $C^{\lceil 3n/2 \rceil + r}$ and $\varphi_s$ is a sequence of measures in $C^m(G)'$,
converging weak-* to Haar measure and satisfying $P_{2s}(\varphi_s - 1) = 0$.

Then $\|\ \|_{A'_m}$ is bounded by a constant times $\|\ \|_{(C^m)'}$, so $\|\varphi_s\|_{A'_m}$ is bounded, and $f$ is
in $A_{n+r+m}(G)$. Hence $\|\varphi_s\|_{A'_m} \|f\|_{A_{n+r+m}/A_s}$ converges to zero. However, our
bounds for $M(s,s)$ show that

$$\|\mathcal{F}P_s(f - f.\varphi_s)\|_{A_0} \le K 2^m s^{n+r+m} s^{-(n+r+m)} \|\varphi_s\|_{A'_m} \|f\|_{A_{n+r+m}/A_s}. \qquad □$$

3. Sampling of Sections 

It is an easy matter to generalize the above results and obtain a sampling 
theorem for sections of homogeneous vector bundles. As the theory here fol- 
lows directly from the sampling theory for groups, I have not been as complete. 
Assume $K$ is a compact subgroup of the compact Lie group $G$, $\tau$ is a finite
dimensional unitary representation of $K$ on $E_0$, and $E = G \times_\tau E_0$. Then we can
multiply a $C^m$ section of $E$ by a distribution on $G/K$ to obtain a “distributional
section” of $E$, which we will think of as a sampled version of the original section.
If we project a sampling distribution on G to a distribution on G/K, then we 
obtain an appropriate sampling distribution on G/K. For harmonic analysis on 
homogeneous vector bundles over G/K, where G is compact, see [31]. 

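3.1. Abstract Sampling for Modules. We shall now generalize the situation
of Section 2.1. Let $\mathcal{A}$ be a complex algebra. For simplicity we shall assume that
$\mathcal{A}$ is commutative. Assume that $\mathcal{M}$, $\mathcal{N}$ are $\mathcal{A}$-modules and that we have an
$\mathcal{A}$-bilinear pairing, $\langle\ ,\ \rangle$, between them. Then for any $h$ in $\mathcal{M}$, and $\varphi$ in $\mathcal{A}'$, we can
define $\varphi.h$ in $\mathcal{N}' = \mathrm{Hom}_{\mathbb{C}}(\mathcal{N}; \mathbb{C})$, by

$$(\varphi.h)(e) = \varphi(\langle e, h \rangle).$$

Let $\{\mathcal{A}_s\}$, $\{\mathcal{M}_s\}$, $\{\mathcal{N}_s\}$ be sets of subspaces of $\mathcal{A}$, $\mathcal{M}$, and $\mathcal{N}$, such that
$\langle \mathcal{N}_s, \mathcal{M}_t \rangle \subseteq \mathcal{A}_{s+t}$. We set $P_s$ to be the projection from $\mathcal{A}'$ onto $\mathcal{A}'_s$ or from $\mathcal{N}'$ onto $\mathcal{N}'_s$ given
by restriction of linear functionals.

Lemma 3.1. Assume $\varphi$, $\mu$ are linear functionals in $\mathcal{A}'$ such that $P_{s+t}(\varphi - \mu) = 0$.
Then

$$P_s(\varphi.h) = P_s(\mu.h)$$

for any $h$ in $\mathcal{M}_t$.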
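
The lemma follows directly from the definitions. A short verification, assuming $e$ ranges over $\mathcal{N}_s$ and $h$ lies in $\mathcal{M}_t$ so that the pairing $\langle e, h \rangle$ lands in $\mathcal{A}_{s+t}$, can be written as follows; the step-by-step layout is an editorial sketch rather than the author's own argument.

```latex
\begin{align*}
(\varphi.h)(e) - (\mu.h)(e)
  &= \varphi(\langle e, h \rangle) - \mu(\langle e, h \rangle) \\
  &= (\varphi - \mu)(\langle e, h \rangle) \\
  &= \bigl(P_{s+t}(\varphi - \mu)\bigr)(\langle e, h \rangle) = 0
  \qquad \text{since } \langle e, h \rangle \in \mathcal{A}_{s+t}.
\end{align*}
```

Since this holds for every $e$ in $\mathcal{N}_s$, the restrictions of $\varphi.h$ and $\mu.h$ to $\mathcal{N}_s$ coincide, which is the statement $P_s(\varphi.h) = P_s(\mu.h)$.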

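Example. Assume $\mathcal{M}$ is a finitely generated $\mathcal{A}$-module, $X$ is a finite dimensional
generating set for $\mathcal{M}$, and $\mathcal{A}_s.\mathcal{A}_t \subseteq \mathcal{A}_{s+t}$. Let $\mathcal{N} = \mathrm{Hom}_{\mathcal{A}}(\mathcal{M}; \mathcal{A})$, and define

$$\mathcal{M}_s = \mathcal{A}_s.X,$$

$$\mathcal{N}_s = \{ f \in \mathcal{N} : f(X) \subseteq \mathcal{A}_s \}.$$

Then $\langle \mathcal{N}_s, \mathcal{M}_t \rangle \subseteq \mathcal{A}_{s+t}$.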




264 



DAVID KEITH MASLEN 



We now return to the general situation. Let $\|\ \|_A$, $\|\ \|_B$, $\|\ \|_{N_s}$, and $\|\ \|_{N'_s}$
be norms on $\mathcal{A}$, $\mathcal{M}$, $\mathcal{N}_s$ and $\mathcal{N}'_s$ respectively, and denote their dual norms with a
prime. Then we can define

$$N(s,t) = \sup\{ \|P_s(h.\varphi)\|_{N'_s} : \|h\|_B = 1,\ \|\varphi\|'_A = 1,\ h \in \mathcal{M},\ \varphi \in \mathcal{A}',\ P_{s+t}\varphi = 0 \}.$$

When there is a possibility of confusion, we shall write $N^{N'_s, A}_{B}$.

Let $\mathcal{A}^{A\prime}$ denote the continuous dual of $\mathcal{A}$ with respect to $\|\ \|_A$, and $\mathcal{M}^B$ be
the completion of $\mathcal{M}$ with respect to $\|\ \|_B$.

Lemma 3.2. Assume $\varphi$, $\mu$ are linear functionals in $\mathcal{A}^{A\prime}$ such that $P_{s+t}(\varphi - \mu) = 0$
and $h \in \mathcal{M}^B$. Then

$$\|P_s(h.\varphi) - P_s(h.\mu)\|_{N'_s} \le N(s,t) \, \|\varphi - \mu\|'_A \, \|h\|_{B/\mathcal{M}_t},$$

where $\|\ \|_{B/\mathcal{M}_t}$ denotes the quotient seminorm on $\mathcal{M}^B/\mathcal{M}_t$.

3.2. Harmonic Analysis of Vector-Valued Functions. Assume $E_0$ is a
finite dimensional complex vector space with norm $\|\ \|_{E_0}$. Let $C^m(G; E_0)$ be
the space of $C^m$ functions on $G$ with values in $E_0$, and when $m$ is a nonnegative
integer, define $\|f\|_{C^m; E_0} = \sup\{ \|L(X_1 \cdots X_p) f(x)\|_{E_0} : x \in G,\ 0 \le p \le
m,\ X_1, \ldots, X_p \in \mathfrak{g},\ \|X_1\| = \cdots = \|X_p\| = 1 \}$. All norms, $\|\ \|_{E_0}$, on $E_0$ will
give equivalent norms $\|\ \|_{C^m; E_0}$ on $C^m(G; E_0)$. Let $\|\ \|_{(C^m; E_0)'}$ be the dual
norm to $\|\ \|_{C^m; E_0}$, and $\|\ \|_{(C^m; E_0^*)'}$ be the norm on $C^m(G; E_0^*)'$, when $E_0^*$ is
given the norm dual to that on $E_0$. The space $C^\infty(G; E_0^*)'$ is the space of all
distributions on $G$ with values in $E_0$, and $C^m(G; E_0^*)'$ is the space of all such
distributions of order at most $m$. We can embed $C^0(G; E_0)$ continuously into
$C^0(G; E_0^*)'$ by means of the map $f \mapsto \mu_G.f$, where for any $h$ in $C^0(G; E_0^*)$, we
have $\langle \mu_G.f, h \rangle = \langle \mu_G, (x \mapsto \langle h(x), f(x) \rangle) \rangle = \int_G \langle h(x), f(x) \rangle \, d\mu_G(x)$, and $\mu_G$ is
Haar measure on $G$.

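Let $\mathcal{F}(\hat{G}; E_0) = \prod_{\gamma \in \hat{G}} (\mathrm{End}(V_\gamma) \otimes E_0)$, and define the Fourier transform, $\mathcal{F}$,
from $C^\infty(G; E_0^*)'$ into $\mathcal{F}(\hat{G}; E_0)$, by

$$\langle X \otimes e^*, (\mathcal{F}s)_\gamma \rangle = \langle s, (x \mapsto \langle X, \Delta_\gamma(x) \rangle e^*) \rangle$$

for any $\gamma$ in $\hat{G}$, $X$ in $\mathrm{End}(V_\gamma)^*$, $e^*$ in $E_0^*$, and $s$ in $C^\infty(G; E_0^*)'$. For a continuous
function, $f$, on $G$ with values in $E_0$, this becomes $(\mathcal{F}f)_\gamma = \int_G \Delta_\gamma(x) \otimes
f(x) \, d\mu_G(x)$.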

We shall define norms on $\mathcal{F}(\hat{G}; E_0)$ which generalize the norms $\|\ \|_{A_m}$ we had
when $E_0$ was $\mathbb{C}$. Given two finite dimensional complex vector spaces, $V$ and $W$,
and norms $\|\ \|_V$ on $V$ and $\|\ \|_W$ on $W$, define the tensor product of these norms,
$\|\ \|_{V \otimes W}$, to be the operator norm on $V \otimes W = \mathrm{Hom}_{\mathbb{C}}(V^*; W)$ relative to the
dual norm $\|\ \|'_V$ on $V^*$, and the norm $\|\ \|_W$ on $W$. For any $\gamma$ in $\hat{G}$ let $\|\ \|_{1,\gamma; E_0}$
denote the norm on $\mathrm{End}(V_\gamma) \otimes E_0$, which is the tensor product of the norms
$\|\ \|_{1,\gamma}$ and $\|\ \|_{E_0}$. Define a norm $\|\ \|_{A_m; E_0}$, which is possibly infinite on $\mathcal{F}(\hat{G}; E_0)$,
by $\|A\|_{A_m; E_0} = \|A_0\|_{1,0; E_0} + \sum_{\lambda \in \hat{G}, \lambda \ne 0} d_\lambda \|\lambda\|^m \|A_\lambda\|_{1,\lambda; E_0}$. Let $A_m(\hat{G}; E_0)$ be



SAMPLING OF FUNCTIONS AND SECTIONS FOR COMPACT GROUPS 265 



the subspace of $\mathcal{F}(\hat{G}; E_0)$ on which this norm is finite. This space is the space
of absolutely summable Fourier transforms of distributions on $G$ with values in
$E_0$ whose first $m$ derivatives also have absolutely summable transforms. The
map, $\mathcal{F}$, is one to one, and its inverse gives a continuous map from $A_m(\hat{G}; E_0)$ into
$C^m(G; E_0)$.

Now, let $\mathcal{M} = \mathcal{A} \otimes E_0$, $\mathcal{N} = \mathcal{A} \otimes E_0^*$. These naturally embed in $C^\infty(G; E_0)$ and
$C^\infty(G; E_0^*)$, and the spaces $\mathcal{F}\mathcal{M}$, $\mathcal{F}\mathcal{N}$ are the subspaces of $\mathcal{F}(\hat{G}; E_0)$ and $\mathcal{F}(\hat{G}; E_0^*)$
of elements with only finitely many nonzero components. Hence we can use $\mathcal{F}$ to shift
any norm on $\mathcal{F}\mathcal{M}$ over to $\mathcal{M}$. Let $\mathcal{M}_s = \mathcal{A}_s \otimes E_0$, and $\mathcal{N}_s = \mathcal{A}_s \otimes E_0^*$. There is
a natural $\mathcal{A}$-bilinear pairing between $\mathcal{M}$ and $\mathcal{N}$. Composing this form with Haar
measure gives a $\mathbb{C}$-bilinear pairing between $\mathcal{M}_s$ and $\mathcal{N}_s$, which we shall use to
identify $\mathcal{N}'_s$ with $\mathcal{M}_s$.

For calculation of $N(s,t)$, it is more convenient to use the norm $\|\ \|_{A_m \otimes E_0}$
defined on $A_m(G) \otimes E_0$, by

$$\|A\|_{A_m \otimes E_0} = \sup\{ \|\langle e_0^*, A \rangle_{A_m(G)}\|_{A_m} : \|e_0^*\|_{E_0^*} = 1 \},$$

where $\langle\ ,\ \rangle_{A_m(G)}$ is the natural $A_m(G)$-bilinear pairing between $E_0^*$ and $A_m(G) \otimes E_0$.
It is easy to show that $A_m(G) \otimes E_0$ naturally embeds in $A_m(G; E_0)$. In fact,
these two spaces are equal, as the following lemma will show. First, some
terminology. We say that $E_0$ has dual bases of unit vectors if there is a basis $\{u_i\}$ of
unit vectors in $E_0$, with a dual basis $\{u_i^*\}$ of $E_0^*$ consisting of unit vectors. This
happens, for example, when $\|\ \|_{E_0}$ is a Hilbert space norm, or a $p$-norm in some
basis.



Lemma 3.3. (i) $\|\ \|_{A_m \otimes E_0} \le \|\ \|_{A_m; E_0}$.

(ii) If $E_0$ has dual bases of unit vectors, then $\|\ \|_{A_m; E_0} \le (\dim E_0) \|\ \|_{A_m \otimes E_0}$.

(iii) $\|\ \|_{A_m; E_0}$ and $\|\ \|_{A_m \otimes E_0}$ are equivalent norms.

Define $M(s,t)$ using the $A_{m_1}$, $A_m$, $A_p$ norms, as we did in Section 2.2.5. We
shall now relate this function to the function $N(s,t)$ for various choices of the
norms on $\mathcal{N}'_s = \mathcal{M}_s$, $\mathcal{A}$, and $\mathcal{M}$.

Theorem 3.4. (i) If $N(s,t)$ is defined using the $A_{m_1} \otimes E_0$, $A_m$, $A_p \otimes E_0$ norms
on $\mathcal{N}'_s$, $\mathcal{A}$ and $\mathcal{M}$, then

$$N^{A_{m_1} \otimes E_0,\, A_m}_{A_p \otimes E_0}(s,t) \le M^{A_{m_1},\, A_m}_{A_p}(s,t).$$

(ii) If $N(s,t)$ is defined using the $(A_{m_1}; E_0)$, $A_m$, $(A_p; E_0)$ norms on $\mathcal{N}'_s$, $\mathcal{A}$ and
$\mathcal{M}$, then for some $C > 0$,

$$N^{(A_{m_1}; E_0),\, A_m}_{(A_p; E_0)}(s,t) \le C \, (\dim E_0) \, M^{A_{m_1},\, A_m}_{A_p}(s,t).$$

When $E_0$ has dual bases of unit vectors, we may take $C = 1$ in the above
inequality.
PROOF. Assume that $\varphi$ is in $\mathcal{A}'$, $h$ is in $\mathcal{M}$, and $e_0^*$ is in $E_0^*$. Then

$$\begin{aligned}
\|\langle e_0^*, P_s(\varphi.h) \rangle_{\mathcal{A}}\|_{A_{m_1}} &= \|P_s(\varphi.\langle e_0^*, h \rangle_{\mathcal{A}})\|_{A_{m_1}} \\
  &\le M(s,t) \, \|\varphi\|_{A'_m} \, \|\langle e_0^*, h \rangle_{\mathcal{A}}\|_{A_p} \\
  &\le M(s,t) \, \|\varphi\|_{A'_m} \, \|e_0^*\|_{E_0^*} \, \|h\|_{A_p \otimes E_0}.
\end{aligned}$$

This proves (i). The second part is an easy corollary of the first. □

The proof of the first part of this theorem did not involve many special properties
of the norms $A_m$; the basic properties used are that $\mathcal{M}$ is dense in $A_p(G) \otimes E_0$
and $\mu_G.\mathcal{A}$ is dense in $A_m(G)'$.

Another approach to bounding N(s,t) uses an analog of Lemma 2.10 to cal- 
culate the bound directly. In some circumstances (e.g. when G is abelian), this 
gives better results than the combination of the previous theorem and the bounds 
for M(s,t). In particular, we do not use the assumption that E 0 has dual bases 
of unit vectors. 

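Lemma 3.5. Assume $f$ is a continuous complex function on $G$, $g$ is in $C^0(G; E_0)$,
and $\mathcal{F}f \in A_0(\hat{G})$, and $\mathcal{F}g \in A_0(\hat{G}; E_0)$. Then

$$\|\mathcal{F}(f.g)\|_{A_0; E_0} \le (\dim E_0) \, \|\mathcal{F}f\|_{A_0} \, \|\mathcal{F}g\|_{A_0; E_0}.$$

Proof. This has essentially the same proof as for the case when $E_0$ is simply
the complex numbers, as given in [11]. □

Lemma 3.5 implies that if $f_\lambda$ is in the $\lambda$-isotypic subspace of $C^\infty(G)$, $g_\mu$ is in
the $\mu$-isotypic subspace of $C^\infty(G; E_0)$, under the left regular actions, and $\nu$ is in
$\hat{G}$, then

$$\|\mathcal{F}(f_\lambda.g_\mu)_\nu\|_{1,\nu; E_0} \le (\dim E_0) \, d_\nu^{-1} d_\lambda d_\mu \, \|\mathcal{F}f_\lambda\|_{1,\lambda} \, \|\mathcal{F}g_\mu\|_{1,\mu; E_0}.$$

When $E_0 = \mathbb{C}$, this inequality is our main ingredient in the bound on $M(s,t)$.
The generalization gives us bounds on $N(s,t)$. The second half of the following
theorem concerns the case when $G$ is abelian. When $G$ is abelian, define norms
on $\mathcal{F}(\hat{G}; E_0)$ for $1 \le q \le \infty$ and $-\infty < m < \infty$ by

$$\|A\|_{q,m; E_0} = \Big( \|A_0\|_{E_0}^q + \sum_{\lambda \in \hat{G} \setminus \{0\}} \big( \|\lambda\|^m \|A_\lambda\|_{E_0} \big)^q \Big)^{1/q},$$

$$\|A\|_{\infty,m; E_0} = \sup\big( \{ \|\lambda\|^m \|A_\lambda\|_{E_0} : \lambda \in \hat{G},\ \lambda \ne 0 \} \cup \{ \|A_0\|_{E_0} \} \big).$$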



Theorem 3.6. (i) Assume $G$ is nonabelian, the norm on $\mathfrak{h}^*$ has property I, and
$N(s,t)$ is defined using the $(A_{m_1}; E_0)$, $A_m$, $(A_p; E_0)$ norms on $\mathcal{N}'_s$, $\mathcal{A}$ and $\mathcal{M}$.
Then for some $K_0$ depending only on $G$ and the norm on $\mathfrak{h}^*$,

$$N^{(A_{m_1}; E_0),\, A_m}_{(A_p; E_0)}(s,t) \le K_0 \, (\dim E_0) \, s^{r+l+m_1+1} (s+t)^{2k+r+m-1} t^{-p}.$$
(ii) Assume $G$ is abelian, $1 \le q_1, q_2, q_3 \le \infty$, and $s$ and $t$ are positive integers.
Then we have



N. 



Q2^ rrL 2 l / 






(M)< 1+ E 



limit?! 



l/9l 



{s + t^t- 



provided $q_3 < q_2$ and $m_3 > m_2$.
PROOF. The key observation in the proof of (i) is that 



\\ p s{h.y)\\A mi -E 0 

< E^a + iMr) E c^iiMiriK^AiiiA^ii^iu^, 

lkll<« ]MI>»+t, II -Ml >t, 

lllHHIMII^IMls ir^ivv-irX 

where $\pi$ is the natural projection from $\mathfrak{h}^*$ onto the dual of the center of $\mathfrak{g}$. Now
sum over $\mu$ and then $\nu$.

The proof of (ii) is essentially the same as for Theorem 2.12. □ 

3.3. Homogeneous Vector Bundles. Assume $E = G \times_\tau E_0$ is a homogeneous
vector bundle, where $\tau$ is a unitary representation of $K$. $E$ has a $G$-invariant
unitary structure determined by the inner product on $E_0$. Let $\Gamma^m(E)$ denote the
space of $C^m$ sections of $E$ with the norm $\|s\|_{\Gamma^m} = \sup\{ \|L(X_1 \cdots X_p) s(x)\|_x :
x \in G/K,\ 0 \le p \le m,\ X_1, \ldots, X_p \in \mathfrak{g} \}$, where $\|\ \|_x$ denotes the norm on the
fiber, $E_x$, determined by the unitary structure of $E$. If $\delta(G/K)$ is the density
bundle and $\mu_{G/K}$ is the invariant density of unit mass on $G/K$, we obtain a
map $\Gamma^0(E) \to \Gamma^0(E \otimes \delta(G/K)) \to \Gamma^0(E^*)'$; $f \mapsto f.\mu_{G/K}$, allowing us to identify
$\Gamma^0(E)$ with a subspace of $\Gamma^0(E^*)'$. Thus we think of $\Gamma^\infty(E^*)'$ as the space of all
distributions, or generalized sections, of $E$.

There is a representation $\psi_\tau$ of $K$ by isometries on each of the spaces $C^m(G; E_0)$
and $C^m(G; E_0)'$, defined by $\psi_\tau(k) f(x) = \tau(k) f(x.k)$ on elements of $C(G; E_0)$,
and which commutes with the left regular action of $G$ on these spaces. The
corresponding spaces of invariant functions or distributions are denoted $C^m(G; \tau)$ and
$C^{m\prime}(G; \tau)$. We then have an isometry$^1$ $j_\tau : C^{m\prime}(G; \tau) \to \Gamma^m(E^*)'$ which restricts
to an isometry between $C^m(G; \tau)$ and $\Gamma^m(E)$. Thus questions about spaces of
sections of $E$ can be simply reduced to ones concerning $\psi_\tau$-invariant vector valued
functions on $G$. In particular, the multiplication map $C^m(G/K)' \times \Gamma^m(E) \to
\Gamma^m(E^*)'$ corresponds to the map $C^m(G)'^K \times C^m(G; \tau) \to C^{m\prime}(G; \tau)$ which is the
restriction of the scalar multiplication map for distributions on $G$ with functions
in $C^m(G; E_0)$.



$^1$ The space $C^{m\prime}(G; \tau)$ of invariant vectors in $C^m(G; E_0^*)'$ is isometric, via the restriction
map, to the space $C^m(G; \tau^\vee)'$. This is because the canonical projection from $C^m(G; E_0^*)'$ onto
$C^{m\prime}(G; \tau)$ is the transpose of the projection from $C^m(G; E_0^*)$ onto $C^m(G; \tau^\vee)$, and this last
projection is also a contraction.
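As in Section 3.2 we set $\mathcal{M} = \mathcal{A} \otimes E_0$ and $\mathcal{N} = \mathcal{A} \otimes E_0^*$. Let $\bar{\mathcal{M}}$, $\bar{\mathcal{N}}$ and $\bar{\mathcal{A}}$ be the
subspaces of $\psi_\tau$-, $\psi_{\tau^\vee}$-, and $K$-invariant vectors in $\mathcal{M}$, $\mathcal{N}$, and $\mathcal{A}$. Let $\bar{\mathcal{M}}_s$, $\bar{\mathcal{N}}_s$,
$\bar{\mathcal{A}}_s$ be the intersections of the spaces above with $\mathcal{M}_s$, $\mathcal{N}_s$ and $\mathcal{A}_s$ respectively.
Finally, we can use $j_\tau$ and $j_{\tau^\vee}$ to obtain corresponding subspaces, $\bar{\mathcal{M}}$, $\bar{\mathcal{N}}$, $\bar{\mathcal{A}}$, $\bar{\mathcal{M}}_s$,
$\bar{\mathcal{N}}_s$, $\bar{\mathcal{A}}_s$ in $\Gamma^\infty(E)$, $\Gamma^\infty(E^*)$ and $C^\infty(G/K)$.

Choosing norms on $\mathcal{N}'_s = \mathcal{M}_s$, $\mathcal{A}$, and $\mathcal{M}$ allows us to define a function $N(s,t)$
as in Section 3.1. If we assume that $\mathcal{N}_s$ is invariant under the projection from $\mathcal{N}$
onto $\bar{\mathcal{N}}$, then the dual of this projection is an injection from $\bar{\mathcal{N}}'_s$ into $\mathcal{N}'_s$, and we
may restrict the norm on $\mathcal{N}'_s$ to $\bar{\mathcal{N}}'_s$; in fact, the $\mathbb{C}$-bilinear pairing between $\bar{\mathcal{N}}_s$
and $\bar{\mathcal{M}}_s$ is nondegenerate in this case. If we also restrict the norms on $\mathcal{A}$ and $\mathcal{M}$
to $\bar{\mathcal{A}}$ and $\bar{\mathcal{M}}$, then we can define another function $\bar{N}(s,t)$ using these restricted
norms.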

Theorem 3.7. Assume that all the subspaces $\mathcal{A}_s$ and the norm on $\mathcal{A}$ are all
invariant under the right regular action of $K$. Then $\bar{N}(s,t) \le N(s,t)$.

PROOF. First note that under these hypotheses, the subspaces $\mathcal{M}_s$, $\mathcal{N}_s$ are
invariant under the representations $\psi_\tau$ and $\psi_{\tau^\vee}$, and so the projections onto
these spaces commute with the projections from $\mathcal{M}$ and $\mathcal{N}$ onto $\bar{\mathcal{M}}$ and $\bar{\mathcal{N}}$.
Hence the definition of $\bar{N}$ makes sense. The projection from $\mathcal{A}$ onto $\bar{\mathcal{A}}$, $P^K$, is a
contraction with respect to $\|\ \|_A$, and its dual, $P^{K*}$, is an isometric embedding of
the continuous dual of $\bar{\mathcal{A}}$ with respect to the restricted norm into the continuous
dual of $\mathcal{A}$ with its norm. $P^K$, which is given by integration over $K$, commutes
with the projections from $\mathcal{A}$ onto $\mathcal{A}_s$, and hence for any $\varphi$ in the continuous
dual of $\bar{\mathcal{A}}$ such that $P_s \varphi = 0$, we also have $P_s(P^{K*}\varphi) = 0$. This allows us to
embed the calculation of $\bar{N}(s,t)$ into a calculation involving only the spaces $\mathcal{A}$,
$\mathcal{M}$, $\mathcal{N}$ and the subspaces $\mathcal{N}_s$, $\mathcal{M}_s$, and $\mathcal{A}_s$, where it is obvious that $\bar{N} \le N$. □

We shall now define the Fourier transform map for spaces of sections of $E$. The
representation, $\psi_\tau$, of $K$ on the $\gamma$-isotypic subspace of $C^\infty(G; E_0)$ corresponds,
under the Fourier transform $\mathcal{F}$, to the representation $\mathrm{Id} \otimes \Delta_\gamma^\vee \otimes \tau$ on $\mathrm{End}(V_\gamma) \otimes
E_0 = V_\gamma \otimes V_\gamma^* \otimes E_0$. The subspace of invariant vectors of this space is naturally
isomorphic to $V_\gamma \otimes \mathrm{Hom}_K(V_\gamma; E_0)$. So the natural space in which to define
the Fourier transform of a section of $E$ is $\mathcal{F}(E) = \prod_{\gamma \in \hat{G}} V_\gamma \otimes \mathrm{Hom}_K(V_\gamma; E_0)$.
Define norms $\|\ \|_{A_m}$ on $\mathcal{F}(E)$ by restricting the norms $\|\ \|_{A_m; E_0}$ on $\mathcal{F}(\hat{G}; E_0)$,
and let $A_m(E)$ denote the subspace of $\mathcal{F}(E)$ on which the corresponding norm is
finite. Let $P_\tau$ denote both the projection from $C^\infty(G; E_0^*)'$ onto the $\psi_\tau$-invariant
subspace, $C^{\infty\prime}(G; \tau)$, and also the projection from $\mathcal{F}(\hat{G}; E_0)$ onto $\mathcal{F}(E)$. Define
the Fourier transform map $\mathcal{F} : \Gamma^\infty(E^*)' \to \mathcal{F}(E)$ so that $P_\tau \mathcal{F} = \mathcal{F} j_\tau$; then $\mathcal{F}$
maps $\Gamma^m(E)$ into $A_m(E)$. When $\tau$ is the trivial representation, the dual space
to $A_m(E)$ corresponds to the space of invariant distributions on $G$ for which $A'_m$,
the dual norm defined previously, is finite. We then have that $\|\mathcal{F}\varphi\|_{A'_m} \le \|\varphi\|_{(C^m)'}$ for
any complex distribution, $\varphi$, on $G/K$. Also note that if $\varphi$ is a distribution on $G$
satisfying $P_s \varphi = 0$, then $P^K \varphi$ satisfies the same equation in $C^\infty(G/K)'$.
Example: Functions on $S^2$. Consider the case where $G = SO(3)$, $K = SO(2)$,
and $\tau$ is the trivial representation of SO(2). Then $E = S^2 \times \mathbb{C}$ is the trivial
bundle over $S^2$, and sections of $E$ may be identified with complex functions on
$S^2$. Identify the dual of SO(3) with the set of nonnegative integers. For any $l \ge 0$
we have $\dim \mathrm{Hom}_{SO(2)}(V_l; \mathbb{C}) = 1$. Choose a $\Delta_l^\vee(SO(2))$-invariant unit vector, $u_l^*$,
in $V_l^*$ for each $l$. Then the map $v \mapsto v \otimes u_l^*$ gives an isomorphism between $V_l$ and
$V_l \otimes \mathrm{Hom}_{SO(2)}(V_l; \mathbb{C})$. The space $V_l \otimes \mathrm{Hom}_{SO(2)}(V_l; \mathbb{C})$ is naturally isomorphic to
the subspace of $\mathrm{End}\, V_l = V_l \otimes V_l^*$ invariant under $\mathrm{Id} \otimes \Delta_l^\vee$. The composition of
these two isomorphisms is a map, $v \mapsto A_v$, from $V_l$ into $\mathrm{End}(V_l)$ which is defined
by $A_v w = u_l^*(w) v$ for any $w \in V_l$. Assume $v$ is any vector in $V_l$. We shall now
find $\|A_v\|_{q,l}$. Let $\mathrm{Pr}_v$ be the self-adjoint projection onto the linear span of $v$,
then $A_v A_v^* = \|v\|^2 \mathrm{Pr}_v$, where $\|v\|$ is the Hilbert space norm, so

$$\|A_v\|_{q,l} = \big( \mathrm{Tr}\, (A_v A_v^*)^{q/2} \big)^{1/q} = \big( \mathrm{Tr}( \|v\|^q \mathrm{Pr}_v ) \big)^{1/q} = \|v\|.$$

Using the isomorphisms above, we can identify $\mathcal{F}(E)$ with $\prod_{l \ge 0} V_l$, and if $y \in
\mathcal{F}(E)$, then $\|y\|_{A_m} = \sum_{l \ge 0} (2l+1) \max\{1, l^m\} \|y_l\|$. One can now use the bounds
as follows. Assume $f$ is a $C^m$ function on $S^2$ with $\mathcal{F}f = y$, and $\varphi$ is a distribution
of order at most $m$ on $S^2$ satisfying $P_{s+t}(\varphi - 1) = 0$. Let $\mathcal{F}(\varphi.f) = z$, then for
any positive integers $s$, $t$, and any $p > m$,

$$\sum_{l=0}^{s} (2l+1) \|y_l - z_l\|
  \le (s+1)^2 (1 + 4s + 2s^2) \Big(1 + \frac{s}{t}\Big)^m t^{m-p} \, \|\mathcal{F}(\varphi - 1)\|_{A'_m} \sum_{l > t} (2l+1) \, l^p \, \|y_l\|$$

and $\|\mathcal{F}(\varphi - 1)\|_{A'_m} = \sup\{ l^{-m} \|(\mathcal{F}\varphi)_l\| : l > s + t \}$.

Example: Line bundles over $S^2$. For this example take $G = SO(3)$, $K = SO(2)$,
and let $\tau = \rho_n$ be the representation of SO(2) with weight $n$, where $n$ is a
nonzero integer. Then $E$ is a line bundle over $S^2$. The space $\mathrm{Hom}_{SO(2)}(V_l; \rho_n)$
has dimension 1 for $l \ge |n|$ and is zero-dimensional when $0 \le l < |n|$. When
$l \ge |n|$ we may choose a unit vector, $w_l^*$, in the $\rho_n$-isotypic space of $V_l^*$ and
obtain an isomorphism, $v \mapsto v \otimes w_l^*$, between $V_l$ and $V_l \otimes \mathrm{Hom}_{SO(2)}(V_l; \rho_n)$. As
before, this allows us to identify $\mathcal{F}(E)$ with $\prod_{l \ge |n|} V_l$ and for any $y \in \mathcal{F}(E)$ we
have $\|y\|_{A_m} = \sum_{l \ge |n|} (2l+1) \, l^m \, \|y_l\|$. To state the sampling theorem for this
situation, assume $f$ is a $C^m$ section of $E$ with $\mathcal{F}f = y$, and $\varphi$ is a distribution of
order at most $m$ on $S^2$ satisfying $P_{s+t}(\varphi - 1) = 0$. Let $\mathcal{F}(\varphi.f) = z$, and assume
$s$, $t$ are positive integers, and $p > m$, then

$$\sum_{l=|n|}^{s} (2l+1) \|y_l - z_l\|
  \le (s+1)^2 (1 + 4s + 2s^2) \Big(1 + \frac{s}{t}\Big)^m t^{m-p} \, \|\mathcal{F}(\varphi - 1)\|_{A'_m} \sum_{l \ge t+1,\ l \ge |n|} (2l+1) \, l^p \, \|y_l\|.$$
4. Construction of Sampling Distributions

4.1. The General Construction. Now we will outline a method for constructing
distributions whose Fourier transform vanishes at a given finite set of
irreducible representations. These distributions will be finitely supported, have
any specified order, and will be of the form $\chi = \psi_1 * \cdots * \psi_n$, where $n = \dim G$
and each of the $\psi_i$ is supported on a finite subset of a 1-parameter subgroup
of $G$. In addition $\psi_1, \ldots, \psi_n$ may be chosen so that $\chi$ has bounded $A_m$ norm as
the set of irreducible representations at which its Fourier transform must vanish
increases. These properties have been chosen as they are required for the
development of efficient algorithms for the computation of the Fourier transform
of functions sampled on the support of these distributions, as in [21]. The
thesis [20] contains a description of these algorithms for functions sampled on
the support of the projection of these distributions to the homogeneous spaces
$SO(n)/SO(n-1)$ and $SU(n)/SU(n-1)$; they are generalizations of the algorithm
for computing expansions in spherical harmonics developed by Driscoll
and Healy in [4]. Here is the general construction.

Assume $G$ is a connected compact Lie group, and $K$ is a connected compact
subgroup of $G$. The Fourier transforms of a distribution, $\psi \in C^\infty(K)'$, and its
image $i\psi$ in $C^\infty(G)'$ are simply related; if $\rho$ is a representation of $G$, then $\rho(i\psi) =
(\rho|_K)(\psi)$. So the relation between the two Fourier transforms is determined by
the way that representations of $G$ split on restriction to $K$.

For any set, $\Omega_0$, of irreducible representations of $G$, define a two-sided ideal in
$C^\infty(G)'$ by

$$\mathcal{I}_{\Omega_0} = \{ f \in C^\infty(G)' : \forall \rho \in \Omega_0,\ \rho(f) = 0 \}.$$

We wish to show how for any finite set of representations, $\Omega_0$, we can construct
a finitely supported distribution, $\chi$, on $G$, such that $\chi - 1 \in \mathcal{I}_{\Omega_0}$. It obviously
suffices to consider the case when $G$ is simple and simply connected, the abelian
case being trivial. Let us also restrict ourselves to the case when $G$ has a rank
one homogeneous space, $G/K$; this only leaves a few exceptional groups out of
our reach.

By induction we can assume that the problem has been solved for $K$; this is
because $K$ is a quotient of a product of abelian groups and semisimple groups
which themselves have rank 1 homogeneous spaces. Now let $\Omega_1$ be the set of
all irreducible representations of $K$ that are contained in the restriction of some
representation in $\Omega_0$ to $K$. This set is finite, and $\mathcal{I}_{\Omega_0} \supseteq i(\mathcal{I}_{\Omega_1})$.

By induction, we can find a finitely supported distribution, $\chi$, on $K$ such that
$\chi - 1 \in \mathcal{I}_{\Omega_1}$. Let $\chi_K = i(\chi)$, then $\chi_K \equiv c_K \pmod{\mathcal{I}_{\Omega_0}}$, where $c_K$ is the
characteristic distribution of the submanifold, $K$, of $G$. By polar decomposition,
$G = KAK$, where $A$ is a 1-parameter subgroup of $G$. The idea is to choose a
finitely supported distribution, $\psi$, with support in $A$, and then let $\chi = \chi_K * \psi *
\chi_K$. Then, $\chi \equiv c_K * \psi * c_K = {}^K\!P^K \psi \pmod{\mathcal{I}_{\Omega_0}}$, where ${}^K\!P^K$ is the projection
onto bi-invariant distributions. ${}^K\!P^K \psi$ has an expansion in terms of spherical
functions. The polar decomposition allows us to establish an isomorphism of
$[-1,1]$ with $K \backslash G / K$ via the obvious composition of maps $[-1,1] \to A \to G \to
K \backslash G / K$. So we can lift ${}^K\!P^K \psi$ up to a finitely supported distribution on $[-1,1]$,
where its spherical function expansion corresponds to an expansion in Jacobi
polynomials of some sort. By the Chebyshev property of orthogonal polynomials
[20, Lemma 3.2], we can choose $\psi$ so that the expansion of ${}^K\!P^K \psi - 1$ in spherical
functions only contains spherical functions corresponding to representations that
are not in $\Omega_0$. That is, choose $\psi$ so that ${}^K\!P^K \psi \equiv 1 \pmod{\mathcal{I}_{\Omega_0}}$. Then $\chi - 1 \in
\mathcal{I}_{\Omega_0}$.

An apparent problem with this method is that the number of distributions
in the convolution product for $\chi$ is too large. We desire exactly $\dim G$ of these
factors, but the method above yields 1 factor for $S^1$, 3 for $SU(2)$, 4 for $S(U_2 \times U_1)$,
9 for $SU(3)$, and $2^k + 2^{k-1} - 3$ for $SU(k)$, and $\dim SU(k) = k^2 - 1$. In the examples
that follow, we use relations between the $\psi_i$ modulo $\mathcal{I}_{\Omega_0}$ to reduce the number
of factors to $\dim G$, when $G$ is one of the classical groups.
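
The factor counts quoted above follow from a simple recursion, sketched below. The sketch assumes that $\chi_G = \chi_K * \psi * \chi_K$ uses twice the factors needed for $K$ plus one, and that passing from $SU(j-1)$ to $S(U_{j-1} \times U_1)$ costs one extra factor; the second assumption is inferred from the listed counts (3 for $SU(2)$, 4 for $S(U_2 \times U_1)$) rather than stated in the text. The function name is hypothetical.

```python
def factors_su(k):
    """Number of convolution factors the recursive construction yields for SU(k)."""
    if k < 2:
        raise ValueError("k must be at least 2")
    count = 1  # S(U_1 x U_1) is a circle: a single factor
    for j in range(2, k + 1):
        if j > 2:
            count += 1         # assumed: S(U_{j-1} x U_1) needs one more than SU(j-1)
        count = 2 * count + 1  # SU(j) = K A K with K = S(U_{j-1} x U_1)
    return count

for k in range(2, 8):
    # Compare with the closed form 2^k + 2^(k-1) - 3 and with dim SU(k) = k^2 - 1.
    print(k, factors_su(k), 2 ** k + 2 ** (k - 1) - 3, k * k - 1)
```

The count agrees with $\dim SU(k)$ only for $k = 2$ and then grows exponentially while the dimension grows quadratically, which is why the reduction described in the examples is needed.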

4.1.1. Quadrature Rules. Assume that $\{\varphi_m\}$ is a sequence of orthonormal
polynomials relative to the positive measure $w(x)\,dx$ on $[a,c]$. Then a finitely
supported distribution $\psi$ satisfying $\langle \psi, \varphi_m \rangle = \delta_{0m}$ for $0 \le m \le n$ is equivalent to a
quadrature formula that exactly integrates polynomials of degree at most $n$ with
respect to $w(x)\,dx$. In the case where $\psi$ is a measure supported at the roots
of $\varphi_n$, this determines the usual Gaussian integration formula, which has the
advantages that $\psi$ is positive and $\langle \psi, \varphi_m \rangle = \delta_{0m}$ for $0 \le m \le 2n + 1$. Similarly, by
choosing the support of $\psi$ to be the roots of the $n$-th $l$-orthogonal polynomial
we may find a distribution of order $2l$, supported on these points, such that
$\langle \psi, \varphi_m \rangle = \delta_{0m}$ for $0 \le m \le (2l + 2)n$. For more on this, see [7].
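
A concrete instance of the Gaussian case can be checked by hand for the Chebyshev weight. The sketch below assumes the normalized weight $w(x) = 1/(\pi\sqrt{1 - x^2})$ on $[-1,1]$, the Chebyshev polynomials $T_m(\cos\theta) = \cos(m\theta)$ (orthogonal, though not scaled to unit norm for $m > 0$), and the standard convention that an $n$-point Gauss rule is exact up to degree $2n - 1$; the indexing may differ from the one used in the text. The function name is hypothetical.

```python
import math

def chebyshev_pairing(n, m):
    """<psi, T_m> for psi = (1/n) * (sum of unit point masses at the n roots of T_n)."""
    return sum(math.cos(m * (2 * j - 1) * math.pi / (2 * n))
               for j in range(1, n + 1)) / n

n = 5
values = [chebyshev_pairing(n, m) for m in range(2 * n + 1)]
for m, v in enumerate(values):
    # Expect 1 at m = 0, approximately 0 for 0 < m < 2n, and a nonzero value at m = 2n.
    print(m, round(v, 12))
```

The measure is positive with total mass 1, and the pairing first fails to vanish at degree $2n$, which illustrates both advantages of the Gaussian choice mentioned above.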

When $\psi$ is a positive measure satisfying the above conditions, the total
variation norm of $\psi$ must be 1. If this measure is pushed onto a Lie group, then
the resulting positive measure also has total variation norm 1, and a convolution
of such measures has total variation norm 1. The construction above (and in
the following examples) can therefore be required to produce measures of total
variation 1 on the classical groups. When $\psi$ is supported at the points $\cos(\pi l/n)$,
$0 \le l \le n$, the total variation norm of $\psi$ tends to 1 as $n$ tends to infinity, provided
that $w$ is a nonnegative $L^1$ function on $[-1,1]$, and $0 < \int_0^\pi w(\cos\theta)\,d\theta < \infty$ (see
[20]).

Together with Lemma 2.4 this shows that the distribution $\chi$ of the subsection
above can be constructed so it is bounded in the $A_m$ norm as the set $\Omega_0$ varies
over finite subsets of $\hat{G}$. To get an explicit formula for $\chi$ we need to know how
to convolve point distributions on $G$; this is explained in [20].

4.2. Example: Sampling on SO(n). The arguments of Section 4.1, when
applied to the chain of groups

$$SO(n) \supset SO(n-1) \supset \cdots \supset SO(2)$$

lead to a sampling distribution on SO(n) that is closely related to the
parametrization of that group by means of Euler angles. Let

$$r_m(\theta) = \begin{pmatrix} I & & \\ & \begin{matrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{matrix} & \\ & & I \end{pmatrix},$$

where the “rotation block” appears in columns and rows $m - 1$ and $m$. Note
that $r_m r_n = r_n r_m$ for $|n - m| > 1$ and $SO(n) = SO(n-1).r_n([0,\pi]).SO(n-1)$. The
highest weight of a representation of $SO(2r+1)$ is determined by its coordinates
$m_{1,2r+1}, \ldots, m_{r,2r+1}$ relative to the basis $\{e_i\}$ described in Section 2.2.4. These
numbers range over all sets of integers satisfying

$$m_{1,2r+1} \ge \cdots \ge m_{r,2r+1} \ge 0.$$

The highest weight of a representation of $SO(2r)$ may also be expressed in the
coordinates of Section 2.2.4, and these coordinates are integers, $m_{1,2r}, \ldots, m_{r,2r}$,
satisfying

$$m_{1,2r} \ge \cdots \ge |m_{r,2r}|.$$

The “betweenness” relations for the restriction of representations of $SO(2r+1)$
to $SO(2r)$ and $SO(2r)$ to $SO(2r-1)$ are then

$$m_{1,2r+1} \ge m_{1,2r} \ge m_{2,2r+1} \ge \cdots \ge m_{r,2r+1} \ge |m_{r,2r}|$$

and

$$m_{1,2r} \ge m_{1,2r-1} \ge m_{2,2r} \ge \cdots \ge m_{r-1,2r-1} \ge |m_{r,2r}|,$$

where the $m_{ij}$ are either all integral or all half integral. For convenience, we’ll
assume that $n$ is either $2k+1$ or $2k$, that the numbers $m_{1,n}, \ldots, m_{k,n}$ satisfy the
appropriate restrictions, and that $n > 2$ in what follows.
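
The commutation property of the rotation blocks introduced earlier in this section is easy to confirm numerically. The sketch below assumes that the block of $r_m(\theta)$ occupies rows and columns $m - 1$ and $m$ (1-based) of an otherwise identity matrix; the helper names are hypothetical and the index is called `k` in the code.

```python
import math

def rotation_block(n, k, theta):
    """n x n identity with a 2 x 2 rotation block in rows/columns k-1 and k (1-based)."""
    r = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    a, b = k - 2, k - 1  # 0-based positions of rows k-1 and k
    r[a][a] = math.cos(theta)
    r[a][b] = math.sin(theta)
    r[b][a] = -math.sin(theta)
    r[b][b] = math.cos(theta)
    return r

def matmul(x, y):
    """Product of two square matrices stored as lists of rows."""
    n = len(x)
    return [[sum(x[i][l] * y[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]

def max_diff(x, y):
    """Largest absolute entrywise difference between two matrices."""
    return max(abs(a - b) for rx, ry in zip(x, y) for a, b in zip(rx, ry))

n = 5
r2, r3, r4 = (rotation_block(n, k, 0.3 * k) for k in (2, 3, 4))
commute_far = max_diff(matmul(r2, r4), matmul(r4, r2))    # indices differ by 2
commute_near = max_diff(matmul(r2, r3), matmul(r3, r2))   # indices differ by 1
print(commute_far, commute_near)
```

Blocks whose indices differ by more than 1 act on disjoint coordinates and therefore commute, while adjacent blocks share a coordinate and generally do not.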

Choose a positive integer, $s$. We shall construct a distribution, $c_n$, on $SO(n)$,
such that $c_n - 1$ vanishes on representations with $\|\lambda\|_H \le s$. In terms of the
coordinates $m_{i,n}$, this is the same as requiring that $m_{1,n} \le s$.

The map $[0,\pi] \to SO(n-1) \backslash SO(n) / SO(n-1) : \theta \mapsto SO(n-1)\, r_n(\theta)\, SO(n-1)$
is a homeomorphism, and its restriction to $(0,\pi)$ is a diffeomorphism. We may
therefore identify this double coset space with $[0,\pi]$. The class one
representations for $SO(n)/SO(n-1)$ have highest weights $(m, 0, \ldots, 0)$, where $m$ is a
nonnegative integer, and the corresponding spherical functions are Gegenbauer
polynomials in $\cos(\theta)$, where $\theta \in [0,\pi]$, namely

$$\phi^n_m(\theta) = \frac{\Gamma(n-2)\, m!}{\Gamma(n+m-2)} \, C_m^{(n-2)/2}(\cos\theta).$$

See [30] for a proof of this. For fixed $n$, the sequence of functions $C_m^{(n-2)/2}$ is a
sequence of real orthogonal polynomials, so the sequence of functions $\phi^n_m$ is an
extended Chebyshev system.
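
The normalizing factor in front of the Gegenbauer polynomial makes the spherical function equal to 1 at $\theta = 0$. The following sketch checks this for $n \ge 3$, assuming the standard three-term recurrence for $C_m^{\lambda}$ and the prefactor $\Gamma(n-2)\,m!/\Gamma(n+m-2)$ as read from the formula above; the function names are hypothetical.

```python
import math

def gegenbauer(m, lam, x):
    """C_m^lam(x) via k C_k = 2 (k + lam - 1) x C_{k-1} - (k + 2 lam - 2) C_{k-2}."""
    if m == 0:
        return 1.0
    prev, cur = 1.0, 2.0 * lam * x
    for k in range(2, m + 1):
        prev, cur = cur, (2.0 * (k + lam - 1) * x * cur
                          - (k + 2.0 * lam - 2) * prev) / k
    return cur

def spherical_function(n, m, theta):
    """Gamma(n-2) m! / Gamma(n+m-2) * C_m^{(n-2)/2}(cos theta), for n >= 3."""
    scale = math.gamma(n - 2) * math.factorial(m) / math.gamma(n + m - 2)
    return scale * gegenbauer(m, (n - 2) / 2.0, math.cos(theta))

# Value at the identity coset (theta = 0) for a few n and m.
for n in (3, 4, 5):
    print(n, [round(spherical_function(n, m, 0.0), 12) for m in range(5)])
```

For $n = 3$ the parameter is $1/2$ and the prefactor is 1, so the spherical functions reduce to the Legendre polynomials, the zonal spherical harmonics on $S^2$.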

Choose real finitely supported distributions, $\psi_{i,k}$, on $[0,\pi]$, for $2 < i \le k \le n$
which each satisfy

$$\langle \psi_{i,k}, \phi^i_m \rangle = \delta_{0m} \quad \text{for } 0 \le m \le s,$$

where $\phi^i_m$ denotes the spherical functions for $SO(i)/SO(i-1)$ described above.
A lot of choices are involved here. In particular, the support, $F$, of $\psi_{i,k}$ may be
any nonempty finite subset of $[0,\pi]$, and the order, $p$, of $\psi_{i,k}$ is likewise arbitrary
provided that $(p+1)|F| \ge s + 1$.

For the case $n = 2$, choose $\psi_{2,k}$ to be a real distribution supported on a finite
subset of $[0, 2\pi)$ such that

$$\langle \psi_{2,k}, e^{im\theta} \rangle = \delta_{0m} \quad \text{for } |m| \le s.$$

Define $\tilde{\psi}_{i,k} = (r_i)_*(\psi_{i,k})$ for $2 \le i \le k \le n$, i.e. $\langle \tilde{\psi}_{i,k}, f \rangle = \langle \psi_{i,k}, f \circ r_i \rangle$ for any
$C^\infty$ function, $f$, on $G$. Finally we can define our sampling distributions:

$$c_2 = \tilde{\psi}_{2,2},$$

$$c_n = \tilde{\psi}_{2,n} * \cdots * \tilde{\psi}_{n,n} * c_{n-1}.$$

The convolution product for $c_n$ has $\dim SO(n) = n(n-1)/2$ factors. It is clear that
we can choose the $\psi_{i,k}$ so that the order of $c_n$ is 0 and $c_n$ has support of size at
most $(2s+1)^{n-1} s^{(n-1)(n-2)/2}$. If we allow $c_n$ to have a higher order, then we
can decrease the size of its support.
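
The factor count can be read off from the recursive definition of $c_n$. This sketch assumes $c_2$ has one factor and that each step from $c_{n-1}$ to $c_n$ adds $n - 1$ factors, and it evaluates the support bound in the form $(2s+1)^{n-1} s^{(n-1)(n-2)/2}$ as read from the text; the function names are hypothetical.

```python
def count_factors(n):
    """Number of convolution factors in c_n: one for c_2, then n - 1 new ones per step."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return 1 if n == 2 else (n - 1) + count_factors(n - 1)

def support_bound(n, s):
    """The bound (2s+1)^(n-1) * s^((n-1)(n-2)/2) on the size of the support of c_n."""
    return (2 * s + 1) ** (n - 1) * s ** ((n - 1) * (n - 2) // 2)

for n in range(2, 7):
    # The factor count matches dim SO(n) = n (n - 1) / 2.
    print(n, count_factors(n), n * (n - 1) // 2, support_bound(n, 4))
```

The bound grows like $s^{n(n-1)/2}$, that is, like $s^{\dim SO(n)}$, so the number of sample points is of the same order as the number of matrix coefficients with $\|\lambda\|_H \le s$.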

Theorem 4.1. If $\|\lambda\|_H \le s$, then $\Delta_\lambda(c_n - 1) = 0$.

Proof. Let

$$\Omega^n = \{ \lambda \in \widehat{SO(n)} : \|\lambda\|_H \le s \} = \{ \lambda_{(m_{1,n}, \ldots, m_{k,n})} : |m_{1,n}| \le s \}.$$

Using the embeddings $C^\infty(SO(2))' \subset \cdots \subset C^\infty(SO(n))'$, and the betweenness
relations for the restriction of representations of $SO(n)$ to $SO(n-1)$, it is
obvious that $\mathcal{I}_{\Omega^2} \subset \cdots \subset \mathcal{I}_{\Omega^n}$. We shall show, using induction, that $c_n \equiv c_{SO(n)}
\pmod{\mathcal{I}_{\Omega^n}}$, for all $n$. Now, from the general arguments given previously, we
know that if we define $\bar{c}_k$ by

$$\bar{c}_2 = \tilde{\psi}_{2,2},$$

$$\bar{c}_k = \bar{c}_{k-1} * \tilde{\psi}_{k,k} * \bar{c}_{k-1},$$

then $\bar{c}_k \equiv c_{SO(k)} \pmod{\mathcal{I}_{\Omega^k}}$, for all $k$. We need to show that $c_n \equiv \bar{c}_n$. To prove
this, it suffices to show that if $\psi_2, \ldots, \psi_n$ are distributions with the support of $\psi_k$
contained in $r_k(\mathbb{R})$, and satisfying $c_{SO(k-1)} * \psi_k * c_{SO(k-1)} \equiv c_{SO(k)} \pmod{\mathcal{I}_{\Omega^k}}$,
then $\bar{c}_n \equiv \psi_2 * \cdots * \psi_n * \bar{c}_{n-1} \pmod{\mathcal{I}_{\Omega^n}}$. By induction, we assume that this is
true for numbers less than $n$. Then for any $\psi_2, \ldots, \psi_n$ as above, we have

$$\begin{aligned}
\bar{c}_n &\equiv \bar{c}_{n-1} * \psi_n * c_{SO(n-1)} \\
  &\equiv (\psi_2 * \cdots * \psi_{n-1} * \bar{c}_{n-2}) * \psi_n * c_{SO(n-1)} \\
  &\equiv \psi_2 * \cdots * \psi_{n-1} * \psi_n * c_{SO(n-2)} * c_{SO(n-1)} \\
  &\equiv \psi_2 * \cdots * \psi_n * \bar{c}_{n-1} \pmod{\mathcal{I}_{\Omega^n}},
\end{aligned}$$

where we have used the facts that $c_{SO(n-2)} * c_{SO(n-1)} = c_{SO(n-1)}$, and that $\bar{c}_{n-2}$
commutes with $\psi_n$.

□

The distribution $P^{SO(n-1)}(\psi_{2,n} * \cdots * \psi_{n,n})$ on $S^{n-1} = SO(n)/SO(n-1)$ is
zero on the associated spherical functions coming from representations of SO(n)
satisfying $|m_{1,n}| < s$. In [20], it is shown that a fast transform is possible for
functions sampled on the support of this distribution.

A similar argument leads to the parametrization of SO(n) by Euler angles.

4.3. Example: Sampling on SU(n). In this case, the appropriate chain of
subgroups to use is

$$SU(n) \supset S(U_{n-1} \times U_1) \supset SU(n-1) \supset \cdots \supset S(U_1 \times U_1).$$

Let rk(O) be the same matrix as was used in the case of SO(?r), but also define 
Qk{0) = Diag(e _ * e , . . . , e ~ ie , e lke , 1, . . . , 1). where there are exactly k entries of 
the form e~ l6 . Note that qk{0) ' SU(fc), that the qk generate the usual choice 

of maximal torus in SU(n), and that 

$$S(U_{n-1} \times U_1) = q_{n-1}([0, 2\pi]).SU(n-1),$$

$$SU(n) = S(U_{n-1} \times U_1).r_n([0, \pi/2]).S(U_{n-1} \times U_1).$$

In fact, the map

$$[0, \pi/2] \to S(U_{n-1} \times U_1) \backslash SU(n) / S(U_{n-1} \times U_1) : \theta \mapsto S(U_{n-1} \times U_1)\, r_n(\theta)\, S(U_{n-1} \times U_1)$$

is a homeomorphism, and its restriction to $(0, \pi/2)$ is a diffeomorphism.

Let $\lambda_{1,n}, \ldots, \lambda_{n-1,n}$ be the coordinates of the highest weight of a representation
of SU(n) relative to the basis, $\{e_i\}$, of the dual of the usual Cartan subalgebra,
as given in Section 2.2.4. Then

$$\lambda_{1,n} \ge \cdots \ge \lambda_{n-1,n} \ge 0.$$

Representations of the group $S(U_{n-1} \times U_1)$ are determined by a collection
of numbers $(\lambda_{1,n-1}, \ldots, \lambda_{n-2,n-1}; \lambda_{n-1,n-1})$, where $(\lambda_{1,n-1}, \ldots, \lambda_{n-2,n-1})$ is the
highest weight of the restriction to SU(n-1), and $\lambda_{n-1,n-1}$ is the weight of the
restriction to the subgroup $q_{n-1}(\mathbb{R})$. The relations giving the representations of
$S(U_{n-1} \times U_1)$ arising are

Al, n -1 = Mi — Mn-1> 



An— 2,n— 1 — Mn-2 Mn— 

n — 1 n — 1 

An— l,n— 1 = (R 1) ^ \ R ^ , Mj ) 

j=l t=l 

where the /ij are integers satisfying 

Al,n ^ Ml — A 2 ,n '' - ■ ■ A n — l,n . Mn— 1 ^ b- 

In the case n = 2 the appropriate relation is Ai ,2 > (Aipl , where A 1>2 — Ai i 
must be even. To restrict to SU(n — 1) from S(U n _i x U\) simply throw away 
A n -i, n -i- If we now define for to > 2 

(1? = {A a : ||A|| ff < s} = {A (Ai m ,...,A m _ lim ) : ^i,m < 4 
^r 1 = {A ( A;A m _ 1|m _i) ; ll A llff < |A m _i jm _i| < (to - 1)4 

= {A(A 1|m _ 1 ,...,A m _2,m-i;A m _i, m _i) ; Al, m -1 < S, |A m -l,m-l| < (to — l)s} 
= {A(A!,i) : l A l,l| ^ S l’ 
then using the embeddings 

C°° i^S{U\ x Ur))' W C ,00 (SU(2))' W W C°° (S(U n —\ x Ur))' ^ C°°(SU(n)y 
and the restriction relations given above, we see that 



‘Ipi ClffC- T 0 .-1 C C Tn;; 



The class 1 representations of SU(n) relative to $S(U_{n-1} \times U_1)$ have highest
weights of the form $(2m, m, \ldots)$, where $m \ge 0$, and using the map $[0, \pi/2] \leftrightarrow
S(U_{n-1} \times U_1) \backslash SU(n) / S(U_{n-1} \times U_1)$ specified above, have corresponding spherical
functions which are Jacobi polynomials in $\cos 2\theta$,

$$\frac{(n-2)!\, m!}{(n+m-2)!}\, P_m^{(n-2,0)}(\cos 2\theta).$$

For a proof of this, see [20].

For 2 < i < k < n choose be a real finitely supported distribution, 
on [0,7t/2], that satisfies = 4 ,m for 0 < to. < |_§J- For 1 < j < 

k < n, choose a real finitely supported distribution, ( h k, on [0, 27 t) that satisfies 
(Cj,kj e*m( )) = 6o,m for |to| < jb. Define Cn-i n i n the same way, with j = n— 1. 
Then set 4,fc = {ri)*(4>i,k), Cj,k = {Qj)*(Cj,k), and define C n -i,n similarly. Finally 
define 

c 2 = Cl, 2 * 4 > 2,2 * Cl, 2) 

C n = (Cl ,n * 1p2 ,n) * ’ ’ ’ * (Cn-l,n * 1pn,n) * Cn- l,n * C n-1- 

Theorem 4.2. c n = c SU (n) ( m od X Q? ) and c n *(' n _ l n = c S (u nXUl ) (mod X^„). 

PROOF. We use induction on n. It suffices to show that if the i^k are distributions 
supported on r fc (R), and Cfc,Cfc satisfy c 5(c/)c _ lxC/l) * ip k * Cs(c/ fc _ixE/i) = c SU (fc), 
and Cfc = Cfc = c gfc ( R ) modulo Toy, then 

Csu(n) = (Cl * '02) * • • • * (Cn-2 * 0n-l) * Cn-2 * c SU(n-l) (mod Tf 2 y). 

By induction, we can assume this holds for numbers less than n. Let Q„ be the 
subgroup of SU(n) given by Q n = {Diagted"- 1 )^ e~ ie , . . . e~ ie ) : 6 G R}, and 
note that Cn -2 * c SU (n- 2 ) * Cn-i = Cn— l * c SU (n- 2 ) * c Qrl (mod Toy). Therefore, 
working modulo Toy , we have 

CSU(n) = Csu(n-l) * Cn-1 * 0n,n * Cn -1 * Csu(n-l) 

= Cl * 02 * • • • * Ipn- 1 * (Cn-2 * c SU(n-2) * Cn-l) * 0n * Cn-1 * c SU(n-l) 

= Ci * • • • * 0n- 1 * (Cn-1 * Csu(n-2) * CQ n ) * 0„ * Cn-1 * c SU(n-l) 

= Ci * ' ' ' * 0n- 1 * Cn-1 * 0n * CSU(n-2) * CQ n * Cn-1 * c SU(n-l) 

= Cl * 02 * ’ ’ ’ * 0n— 1 * Cn-1 * 0n,n * Cn-l,n * c SU(n-l) 5 
where we used the fact that Q n C S{U n -\ x U\). □ 

The distribution, P SU(n ” 1) (Cl,n * 02, n * • ■ ■ * Cn-l,n * 0n,n * Cn— l,n)> on S' 2 ”” 1 = 
SU(n — l)/SU(n — 1), is zero on associated spherical functions coming from 
representations whose highest weight, (Ai >ra , . . . , A n _i : „), satisfies Ai,„ < b. In 
[20] it is shown how to perform fast transforms for functions sampled on the 
support of this distribution. By commutativity, (Ci,n*02,n)*’ ■ •*(Cn-i,n*0n,n) — 
(Ci,n * • • • * Cn— i,n) * 02, n * • ■ ■ 0n,n), so by replacing Cl,n * • • • * Cn-l,n by an 
appropriate distribution on the maximal torus of SU(n), we can obtain yet more 
distributions on SU(n), which satisfy the above theorem. 

The same commutativity relations can be applied to the subgroups $q_i$ and $r_j$ of 
SU(n). This yields a parametrization of SU(n), which is analogous to the Euler 
angles for SO(n). 

4.4. Example: Sampling on Sp(n). $Sp(n) = \{A \in M_n(\mathbb{H}) : A^*A = \mathrm{Id}\}$,
where $\mathbb{H}$ denotes the division ring of quaternions. By elementary geometry, one
can see that $Sp(n)/(Sp(n-1) \times Sp(1))$ is isomorphic to the right quaternionic
projective space, $P^{n-1}\mathbb{H}$, and that the map

$$[0, \pi/2] \to (Sp(n-1) \times Sp(1)) \backslash Sp(n) / (Sp(n-1) \times Sp(1)) : \theta \mapsto (Sp(n-1) \times Sp(1)).r_n(\theta).(Sp(n-1) \times Sp(1))$$

is a homeomorphism, and its restriction to $(0, \pi/2)$ is a diffeomorphism. Note
that $Sp(1) \cong SU(2)$.



Let

$$R_n = \{\mathrm{Diag}(1, \ldots, 1, a) : a \in Sp(1)\},$$

so that $Sp(n-1) \times Sp(1) = Sp(n-1).R_n$.

Working in the basis $\{e_i\}$ of Section 2.2.4, the highest weights of representations
of Sp(n) are determined by integers $m_{1,n}, \ldots, m_{n,n}$, where

$$m_{1,n} \ge \cdots \ge m_{n,n} \ge 0.$$

The highest weights, $\nu = (m_{1,n-1}, \ldots, m_{n-1,n-1})$, of those representations occurring
in the restriction of the representation, $\lambda_{(m_{1,n}, \ldots, m_{n,n})}$, of Sp(n) to Sp(n-1)
satisfy

$$p_1 \ge m_{1,n-1} \ge p_2 \ge \cdots \ge m_{n-1,n-1} \ge p_n,$$

where

$$m_{1,n} \ge p_1 \ge \cdots \ge m_{n,n} \ge p_n \ge 0,$$

but the corresponding multiplicities may be greater than one. The restriction of 
A to Sp(n - 1) x Sp(l) is precisely 



E 




) A 






i= 1 



i}— max{mi, 







where to„+ i jn = TO n ,n-i = 0, m-o.n-i = +oo, and v ranges over the highest 
weights of irreducible representations of Sp(n) appearing in the restriction of 
A t° Sp(n— 1); see [33]. Hence, highest weights, m, of the represen- 
tations occurring in the restriction from Sp(n) to R n satisfy m\ n > to. It should 
be clear then, that if we define, for any positive integer s, 



= {Aa : ||A|| ff < s} = {A (m 1 ,„,...,m n , n ) : m lt „ < s}, 



SU('2‘) 

then T^i Q ■ ■ ■ C Toy- Also, let fi s be the set of all irreducible representa- 
tions, A m , of SU(2) such that 0 < m < b, and denote the corresponding set of 
representations of R n by f 2f n . Using the embedding C°°{R n )' =— > C^Sp^r))', 
we see that T n R„ C Toy . 

For any 1 < k < n, we can construct, using previous techniques, a finitely 
supported measure, ffc >ra , on <-» SU(2), such that Vk, n = CR k (mod T^r,.). 
Now assume that $n \ge 2$. The class one representations of Sp(n) relative to
$Sp(n-1) \times Sp(1)$ have highest weights of the form $(m, m, 0, \ldots)$, where $m$ is a
nonnegative integer, and the corresponding spherical functions can be written
using the map $[0, \pi/2] \to (Sp(n-1) \times Sp(1)) \backslash Sp(n) / (Sp(n-1) \times Sp(1))$, in the
form

$$\frac{(2n-3)!\, m!}{(m+2n-3)!}\, P_m^{(2n-3,1)}(\cos 2\theta).$$

For a proof of this, see [15]. Let ipk,n be a real finitely supported distribution 
on [0,7t/2] that satisfies = ,m for 0 < to < s, and set ipk, n = 

(fk)*(ipk,n)- Then define c n inductively by 



$$c_1 = \nu_{1,1},$$

$$c_n = \nu_{1,n} * (\psi_{2,n} * \nu_{2,n}) * \cdots * (\psi_{n,n} * \nu_{n,n}) * c_{n-1}.$$

This finitely supported measure is the convolution product of $\dim Sp(n) = 2n^2 + n$
factors each supported on a 1-parameter subgroup of Sp(n), and it is easy to
prove the following theorem.

Theorem 4.3. c n = cs p ( n ) (mod Toy). 

PROOF. Similar to the SO(n) and SU(n) cases. □

Abstract. In 1965 J. Cooley and J. Tukey published an article detailing an
efficient algorithm to compute the Discrete Fourier Transform, necessary
for processing the newly available reams of digital time series produced
by recently invented analog-to-digital converters. Since then, the Cooley-
Tukey Fast Fourier Transform and its variants have been a staple of digital
signal processing.

Among the many casts of the algorithm, a natural one is as an efficient
algorithm for computing the Fourier expansion of a function on a finite
abelian group. In this paper we survey some of our recent work on the
“separation of variables” approach to computing a Fourier transform on an
arbitrary finite group. This is a natural generalization of the Cooley-Tukey
algorithm. In addition we touch on extensions of this idea to compact and
noncompact groups.



Pure and Applied Mathematics: Two Sides of a Coin 

The Bulletin of the AMS for November 1979 had a paper by L. Auslander and 
R. Tolimieri [3] with the delightful title “Is computing with the Finite Fourier 
Transform pure or applied mathematics?” This rhetorical question was answered 
by showing that in fact, the finite Fourier transform, and the family of efficient 
algorithms used to compute it, the Fast Fourier Transform (FFT), a pillar of 
the world of digital signal processing, were of interest to both pure and applied 
mathematicians.

Auslander had come of age as an applied mathematician at a time when pure
and applied mathematicians still received much of the same training. The ends 
towards which these skills were then directed became a matter of taste. As 
Tolimieri retells it (private communication), Auslander had become distressed 
at the development of a separate discipline of applied mathematics which had 
grown apart from much of core mathematics. The effect of this development 
was detrimental on both sides. On the one hand applied mathematicians had 
fewer tools to bring to problems, and conversely, pure mathematicians were often 
ignoring the fertile bed of inspiration provided by real world problems. Auslander 
hoped their paper would help mend a growing perceived rift in the mathematical 
community by showing the ultimate unity of pure and applied mathematics. 

We will show that investigation of finite and fast Fourier transforms contin- 
ues to be a varied and interesting direction of mathematical research. Whereas 
Auslander and Tolimieri concentrated on relations to nilpotent harmonic analy- 
sis and theta functions, we emphasize connections between the famous Cooley- 
Tukey FFT and group representation theory. In this way we hope to provide 
further evidence of the rich interplay of ideas which can be found at the nexus 
of pure and applied mathematics. 

1. Background 

The finite Fourier transform or discrete Fourier transform (DFT) has several 
representation theoretic interpretations: either as an exact computation of the 
Fourier coefficients of a function on the cyclic group Z/nZ or a function of band-
limit n on the circle $S^1$, or as an approximation to the Fourier transform of a
function on the real line. For each of these points of view there is a natural group- 
theoretic generalization, and also a corresponding set of efficient algorithms for 
computing the quantities involved. These algorithms collectively make up the 
Fast Fourier Transform or FFT. 

Formally, the DFT is a linear transformation mapping any complex vector of
length $n$, $f = (f(0), \ldots, f(n-1))^t \in \mathbb{C}^n$, to its Fourier transform, $\hat f \in \mathbb{C}^n$. The
$k$th component of $\hat f$, the DFT of $f$ at frequency $k$, is

$$\hat f(k) = \sum_{j=0}^{n-1} f(j)\, e^{2\pi i jk/n} \tag{1-1}$$

where $i = \sqrt{-1}$, and the inverse Fourier transform is

$$f(j) = \frac{1}{n} \sum_{k=0}^{n-1} \hat f(k)\, e^{-2\pi i jk/n}. \tag{1-2}$$

Thus, with respect to the standard basis, the DFT can be expressed as the
matrix-vector product $\hat f = F_n \cdot f$ where $F_n$ is the Fourier matrix of order $n$,
whose $j, k$ entry is equal to $e^{2\pi i jk/n}$. Computing a DFT directly would require $n^2$
scalar operations. (For precision’s sake: Our count of operations is the number
of complex additions or the number of complex multiplications, whichever is
greater.) Instead, the FFT is a family of algorithms for computing the DFT of
any $f \in \mathbb{C}^n$ in $O(n \log n)$ operations. Since inversion can be framed as the DFT
of the function $\check f(k) = \frac{1}{n} \hat f(-k)$, the FFT also gives an efficient inverse Fourier
transform.
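
The definitions (1-1) and (1-2) translate directly into code. The following is a minimal sketch of the direct $n^2$ evaluation, not a fast algorithm; the function names `dft` and `inverse_dft` are illustrative. The inverse is computed exactly as described above, as the DFT of $k \mapsto \hat f(-k)/n$ with indices taken modulo $n$.

```python
import cmath

def dft(f):
    """Direct evaluation of (1-1): fhat(k) = sum_j f(j) e^{2 pi i jk/n}."""
    n = len(f)
    return [sum(f[j] * cmath.exp(2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def inverse_dft(fhat):
    """Inverse transform (1-2), framed as the DFT of g(k) = fhat(-k) / n."""
    n = len(fhat)
    g = [fhat[(-k) % n] / n for k in range(n)]
    return dft(g)

f = [1, 2 - 1j, 0, 3j, -4]
recovered = inverse_dft(dft(f))
assert all(abs(a - b) < 1e-9 for a, b in zip(f, recovered))
```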

One of the main practical implications of the FFT is that it allows any cycli- 
cally invariant linear operator to be applied to a vector in only 0(n\ogn) scalar 
operations. Indeed, the DFT diagonalizes any group invariant operator, making 
possible the following algorithm: (1) compute the Fourier transform (DFT). (2) 
Multiply the DFT by the eigenvalues of the operator, which are also found using 
the Fourier transform. (3) Compute the inverse Fourier transform of the result. 
This technique is the basis of digital filtering and is also used for the efficient 
numerical solution of partial differential equations. 
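
The three-step algorithm can be sketched for a cyclically invariant operator given by circular convolution with a kernel `a`, so that $(Af)(j) = \sum_l a(j-l) f(l)$. In this sketch the transforms are evaluated directly for clarity; an FFT routine would replace `dft` to obtain the $O(n \log n)$ cost. All function names are illustrative.

```python
import cmath

def dft(f, sign=+1):
    """Direct DFT with kernel e^{sign * 2 pi i jk/n}."""
    n = len(f)
    return [sum(f[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def inverse_dft(fhat):
    n = len(fhat)
    return [v / n for v in dft(fhat, sign=-1)]

def apply_direct(a, f):
    """Apply the circulant operator with kernel a by its definition."""
    n = len(f)
    return [sum(a[(j - l) % n] * f[l] for l in range(n)) for j in range(n)]

def apply_via_dft(a, f):
    fhat = dft(f)                                         # step (1)
    eigenvalues = dft(a)                                  # eigenvalues of the operator
    product = [e * v for e, v in zip(eigenvalues, fhat)]  # step (2)
    return inverse_dft(product)                           # step (3)

a = [2, -1, 0, 0, -1]        # periodic second-difference operator
f = [1, 2, 3, 4, 5]
assert all(abs(x - y) < 1e-9 for x, y in zip(apply_direct(a, f), apply_via_dft(a, f)))
```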



Some history. Since the Fourier matrix is effectively the character table of
a cyclic group, it is not surprising that some of its earliest appearances are in
number theory, the subject which gave birth to character theory. Consideration
of the Fourier matrix goes back at least as far as Gauss, who was interested
in its connections to quadratic reciprocity. In particular, Gauss showed that for
distinct odd primes $p$ and $q$,

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = \frac{\operatorname{Trace} F_{pq}}{\operatorname{Trace} F_p \, \operatorname{Trace} F_q}, \tag{1-3}$$

where $\left(\frac{\cdot}{\cdot}\right)$ denotes the Legendre symbol. Gauss also established a formula for
the quadratic Gauss sum $\operatorname{Trace} F_n$, which is discussed in detail in [3].
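
The trace identity can be checked numerically for small primes. The sketch below uses the fact that the trace of the Fourier matrix $F_n$ is the quadratic Gauss sum $\sum_j e^{2\pi i j^2/n}$ and compares the ratio of traces with the product of Legendre symbols $(p/q)(q/p)$, computed here by Euler's criterion. The function names are illustrative.

```python
import cmath

def trace_fourier_matrix(n):
    """Trace of F_n: the sum of the diagonal entries e^{2 pi i j*j / n}."""
    return sum(cmath.exp(2j * cmath.pi * j * j / n) for j in range(n))

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def reciprocity_ratio(p, q):
    """Trace F_pq / (Trace F_p * Trace F_q) for distinct odd primes p, q."""
    return trace_fourier_matrix(p * q) / (trace_fourier_matrix(p) * trace_fourier_matrix(q))

for p, q in [(3, 5), (3, 7), (5, 7), (7, 11)]:
    assert abs(reciprocity_ratio(p, q) - legendre(p, q) * legendre(q, p)) < 1e-9
```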

Another early appearance of the DFT occurs in the origins of representation
theory in the work of Dedekind and Frobenius on the group determinant. For
a finite group $G$, the group determinant $\Theta_G$ is defined as the homogeneous
polynomial in the variables $x_g$ (for each $g \in G$) given by the determinant of
the matrix whose rows and columns are indexed by the elements of $G$ with $g, h$-entry
equal to $x_{gh^{-1}}$. Frobenius showed that when $G$ is abelian, $\Theta_G$ admits the
factorization

$$\Theta_G = \prod_{\chi \in \hat G} \left( \sum_{g \in G} \chi(g)\, x_g \right), \tag{1-4}$$

where $\hat G$ is the set of characters of $G$. The linear form defined by the inner sum
in (1-4) is a “generic” DFT at the frequency $\chi$.

In the nonabelian case, $\Theta_G$ admits an analogous factorization in terms of
irreducible polynomials of the form

$$\Theta_D(G) = \det\left( \sum_{g \in G} D(g)\, x_g \right)$$

where $D$ is an irreducible matrix representation of $G$. The inner sum here is a
generic Fourier transform over $G$. See [12] for a beautiful historical exposition
of these ideas.
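
For a cyclic group the factorization (1-4) can be verified numerically. The sketch below builds the group matrix for Z/nZ, whose $g, h$ entry is $x_{g-h}$, and compares its determinant with the product of the generic DFTs $\sum_g \chi_k(g) x_g$. The elimination-based `determinant` helper and the other names are illustrative.

```python
import cmath

def determinant(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [list(map(complex, row)) for row in M]
    n = len(A)
    d = 1 + 0j
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[pivot][c]) < 1e-14:
            return 0j
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return d

def group_determinant_cyclic(x):
    """Determinant of the matrix with (g, h) entry x[(g - h) mod n]."""
    n = len(x)
    return determinant([[x[(g - h) % n] for h in range(n)] for g in range(n)])

def product_of_generic_dfts(x):
    """Product over characters chi_k of the linear forms sum_g chi_k(g) x_g."""
    n = len(x)
    result = 1 + 0j
    for k in range(n):
        result *= sum(cmath.exp(2j * cmath.pi * k * g / n) * x[g] for g in range(n))
    return result

x = [1.0, 2.0, -0.5, 3.0, 0.25]
assert abs(group_determinant_cyclic(x) - product_of_generic_dfts(x)) < 1e-8
```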

Gauss’s interests ranged over all areas of mathematics and its applications, so 
it is perhaps not surprising that the first appearance of an FFT can also be traced 
back to him [10]. Gauss was interested in certain astronomical calculations, a 
recurrent area of application of the FFT, necessary for interpolation of asteroidal 
orbits from a finite set of equally-spaced observations. Surely the prospect of a 
huge laborious hand calculation was good motivation for the development of a 
fast algorithm. Making fewer hand calculations also implies less opportunity for 
error and hence increased numerical stability! 

Gauss wanted to compute the Fourier coefficients, $a_k, b_k$, of a function represented
by a Fourier series of bandwidth $n$,

$$f(x) = \sum_{k=0}^{m} a_k \cos 2\pi kx + \sum_{k=1}^{m} b_k \sin 2\pi kx, \tag{1-5}$$

where $m = (n-1)/2$ for $n$ odd and $m = n/2$ for $n$ even. He first observed
that the Fourier coefficients can be computed by a DFT of length $n$ using the
values of $f$ at equispaced sample points. Gauss then went on to show that if
$n = n_1 n_2$, this DFT can in turn be reduced to first computing $n_1$ DFTs of length
$n_2$, using equispaced subsets of the sample points, i.e., a subsampled DFT, and
then combining these shorter DFTs using various trigonometric identities. This
is the basic idea underlying the Cooley-Tukey FFT.

Unfortunately, this reduction never appeared outside of Gauss’s collected
works. Similar ideas, usually for the case $n_1 = 2$, were rediscovered intermittently
over the succeeding years. Notable among these is the doubling trick of
Danielson and Lanczos (1942), performed in the service of x-ray crystallography, 
another frequent employer of FFT technology. Nevertheless, it was not until the 
publication of Cooley and Tukey’s famous paper [7] that the algorithm gained 
any notice. The story of Cooley and Tukey’s collaboration is an interesting one. 
Tukey arrived at the basic reduction while in a meeting of President Kennedy’s
Science Advisory Committee where among the topics of discussion were techniques
for off-shore detection of nuclear tests in the Soviet Union. Ratification
of a proposed United States/Soviet Union nuclear test ban depended upon the 
development of a method for detecting the tests without actually visiting the 
Soviet nuclear facilities. One idea required the analysis of seismological time 
series obtained from off-shore seismometers, the length and number of which 
would require fast algorithms for computing the DFT. Other possible applica- 
tions to national security included the long-range acoustic detection of nuclear 
submarines. 

R. Garwin of IBM was another of the participants at this meeting, and when
Tukey showed him this idea Garwin immediately saw a wide range of potential
applicability and quickly set to getting this algorithm implemented. Garwin was
directed to Cooley, and, needing to hide the national security issues, told Cooley
that he wanted the code for another problem of interest: the determination
of the periodicities of the spin orientations in a 3-D crystal of He$^3$. Cooley
had other projects going on, and only after quite a lot of prodding did he sit
down to program the “Cooley-Tukey” FFT. In short order, Cooley and Tukey
prepared their paper, which, for a mathematics/computer science paper, was
published almost instantaneously: in six months! This publication, Garwin’s
fervent proselytizing, as well as the new flood of data available from recently
developed fast analog-to-digital converters, did much to help call attention to
the existence of this apparently new fast and useful algorithm. In fact, the
significance of and interest in the FFT was such that it is sometimes thought
of as having given birth to the modern field of analysis of algorithms. See also
[6] and the 1967 and 1969 special issues of the IEEE Transactions on Audio
and Electroacoustics for more historical details.

The Fourier transform and finite groups. One natural group-theoretic
interpretation of the Fourier transform is as a change of basis in the space of
complex functions on Z/nZ. Given a complex function $f$ on Z/nZ, we may
expand $f$ in the basis of irreducible characters $\{\chi_k\}$, defined by $\chi_k(j) = e^{2\pi i jk/n}$.
By (1-2) the coefficient of $\chi_k$ in the expansion is equal to the scaled Fourier
coefficient $\frac{1}{n} \hat f(-k)$, whereas the Fourier coefficient $\hat f(k)$ is the inner product of
the vector of function values of $f$ with those of the character $\chi_k$.

For an arbitrary finite group $G$ there is an analogous definition. The characters
of Z/nZ are the simplest example of a matrix representation, which for any group
$G$ is a matrix-valued function $\rho(g)$ on $G$ such that $\rho(ab) = \rho(a)\rho(b)$, and $\rho(e)$
is the identity matrix. Given a matrix representation $\rho$ of dimension $d_\rho$, and
a complex function $f$ on $G$, the Fourier transform of $f$ at $\rho$ is defined as the
matrix sum

$$\hat f(\rho) = \sum_{x \in G} f(x)\, \rho(x). \tag{1-6}$$

Computing $\hat f(\rho)$ is equivalent to the computation of the $d_\rho^2$ scalar Fourier transforms
at each of the individual matrix elements $\rho_{ij}$,

$$\hat f(\rho_{ij}) = \sum_{x \in G} f(x)\, \rho_{ij}(x). \tag{1-7}$$

A set of matrix representations $\mathcal{R}$ of $G$ is called a complete set of irreducible
representations if and only if the collection of matrix elements of the representations,
relative to an arbitrary choice of basis for each matrix representation
in the set, forms a basis for the space of complex functions on $G$. The Fourier
transform of $f$ with respect to $\mathcal{R}$ is then defined as the collection of individual
transforms, while the Fourier transform on $G$ means any Fourier transform computed
with respect to some complete set of irreducibles. In this case, the inverse
transform is given explicitly as

$$f(x) = \frac{1}{|G|} \sum_{\rho \in \mathcal{R}} d_\rho \operatorname{Trace}\bigl(\hat f(\rho)\, \rho(x^{-1})\bigr). \tag{1-8}$$

Equation (1-8) shows us a relation between the group Fourier transform and the
expansion of a function in the basis of matrix elements. The coefficient of $\rho_{ij}$
in the expansion of $f$ is the Fourier transform of $f$ at the dual representation
$[\rho_{ji}(g^{-1})]$ scaled by the factor $d_\rho / |G|$.
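
The definitions (1-6) and (1-8) can be exercised on the smallest nonabelian group, $S_3$. The sketch below assumes one particular complete set of irreducibles: the trivial representation, the sign representation, and a 2-dimensional representation obtained by restricting permutation matrices to the plane $x + y + z = 0$. This choice of basis and all function names are assumptions made for illustration.

```python
import itertools
import math

G = list(itertools.permutations(range(3)))      # the six elements of S_3

def inverse(a):
    inv = [0] * len(a)
    for i, ai in enumerate(a):
        inv[ai] = i
    return tuple(inv)

def sign(a):
    inversions = sum(1 for i in range(len(a)) for j in range(i + 1, len(a)) if a[i] > a[j])
    return -1.0 if inversions % 2 else 1.0

# Orthonormal basis (as columns) of the plane x + y + z = 0 in R^3.
B = [(1 / math.sqrt(2), 1 / math.sqrt(6)),
     (-1 / math.sqrt(2), 1 / math.sqrt(6)),
     (0.0, -2 / math.sqrt(6))]

def standard(a):
    """2-dimensional representation: B^T P(a) B, with P(a) e_c = e_{a[c]}."""
    return [[sum(B[a[c]][i] * B[c][j] for c in range(3)) for j in range(2)]
            for i in range(2)]

REPS = [lambda a: [[1.0]], lambda a: [[sign(a)]], standard]

def fourier_transform(f):
    """Matrix sums (1-6), one per representation in REPS."""
    fhat = []
    for rho in REPS:
        d = len(rho(G[0]))
        total = [[0.0] * d for _ in range(d)]
        for x in G:
            R = rho(x)
            for i in range(d):
                for j in range(d):
                    total[i][j] += f[x] * R[i][j]
        fhat.append(total)
    return fhat

def inverse_transform(fhat):
    """Inversion formula (1-8)."""
    f = {}
    for x in G:
        x_inv = inverse(x)
        acc = 0.0
        for rho, F in zip(REPS, fhat):
            R = rho(x_inv)
            d = len(R)
            acc += d * sum(F[i][j] * R[j][i] for i in range(d) for j in range(d))
        f[x] = acc / len(G)
    return f

f = {x: float(n) for n, x in enumerate(G)}
recovered = inverse_transform(fourier_transform(f))
assert all(abs(f[x] - recovered[x]) < 1e-9 for x in G)
```

The dimensions 1, 1, 2 satisfy $1 + 1 + 4 = 6 = |S_3|$, which is why the matrix elements form a basis for functions on the group.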

Viewing the Fourier transform on $G$ as a simple matrix-vector multiplication
leads to some simple bounds on the number of operations required to compute
the transform. The computation clearly takes no more than the $|G|^2$ scalar
operations required for any matrix-vector multiplication. On the other hand the
column of the Fourier matrix corresponding to the trivial representation is all
ones, so at least $|G| - 1$ additions are necessary. One main goal of this finite
group FFT research is to discover algorithms which can significantly reduce the
upper bound for various classes of groups, or even all finite groups.

The current state of affairs for finite group FFTs. Analysis of the Fourier
transform shows that for $G$ abelian, the number of operations required is bounded
by $O(|G| \log |G|)$. For arbitrary groups $G$, upper bounds of $O(|G| \log |G|)$ remain
the holy grail in group FFT research. In 1978, A. Willsky provided the first nonabelian
example by showing that certain metabelian groups had an $O(|G| \log |G|)$
Fourier transform algorithm [20]. Implicit in the big-O notation is the idea that
a family of groups is under consideration, with the size of the individual groups
going to infinity.

Since Willsky’s initial discovery much progress has been made. U. Baum has
shown that the supersolvable groups admit an $O(|G| \log |G|)$ FFT, while others
have shown that symmetric groups admit $O(|G| \log^2 |G|)$ FFTs (see Section 3).
Other groups for which highly improved (but not $O(|G| \log^c |G|)$) algorithms have
been discovered include the matrix groups over finite fields, and more generally,
the Lie groups of finite type. See [15] for pointers to the literature. There is much
work to be done finding new classes of groups which admit fast transforms, and
improving on the above results. The ultimate goal is to settle or make progress
on the following conjecture:

Conjecture 1. There exist constants $C_1$ and $C_2$ such that for any finite group
$G$, there is a complete set of irreducible matrix representations for which the
Fourier transform of any complex function on $G$ may be computed in fewer
than $C_1 |G| \log^{C_2} |G|$ scalar operations.

2. The Cooley-Tukey Algorithm

Cooley and Tukey showed [7] how the Fourier transform on the cyclic group
Z/nZ, where $n = pq$ is composite, could be written in terms of Fourier transforms
on the subgroup $q$Z/$n$Z ≅ Z/$p$Z. The trick is to change variables, so that the one
dimensional formula (1-1) is turned into a two dimensional formula, which can
be computed in two stages. Define variables $j_1, j_2, k_1, k_2$ through the equations

$$j = j(j_1, j_2) = j_1 q + j_2, \quad 0 \le j_1 < p, \quad 0 \le j_2 < q,$$
$$k = k(k_1, k_2) = k_2 p + k_1, \quad 0 \le k_1 < p, \quad 0 \le k_2 < q. \tag{2-1}$$

It follows from these equations that (1-1) can be rewritten as

$$\hat f(k_1, k_2) = \sum_{j_2=0}^{q-1} e^{2\pi i j_2 (k_2 p + k_1)/n} \sum_{j_1=0}^{p-1} e^{2\pi i j_1 k_1/p} f(j_1, j_2). \tag{2-2}$$

We now compute $\hat f$ in two stages:

• Stage 1: For each $k_1$ and $j_2$ compute the inner sum

$$\tilde f(k_1, j_2) = \sum_{j_1=0}^{p-1} e^{2\pi i j_1 k_1/p} f(j_1, j_2). \tag{2-3}$$

This requires at most $p^2 q$ scalar operations.

• Stage 2: For each $k_1, k_2$ compute the outer sum

$$\hat f(k_1, k_2) = \sum_{j_2=0}^{q-1} e^{2\pi i j_2 (k_2 p + k_1)/n} \tilde f(k_1, j_2). \tag{2-4}$$

This requires an additional $q^2 p$ operations.

Thus, instead of $(pq)^2$ operations, the above algorithm uses $(pq)(p + q)$ operations.

Stage 1 has the form of a DFT on the subgroup $q$Z/$n$Z ≅ Z/$p$Z, embedded
as the set of multiples of $q$, whereas stage 2 has the form of a DFT on a cyclic
group of order $q$, so if $n$ could be factored further, we could apply the same trick
to these DFTs in turn. Thus, if $N$ has the prime factorization $N = p_1 \cdots p_m$,
then we recover Cooley and Tukey’s original $m$-stage algorithm which requires
$N \sum_i p_i$ operations [7].
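
The two stages can be sketched as follows for a single factorization $n = pq$, with the input stored so that $f(j_1, j_2)$ is entry $j_1 q + j_2$ and the output so that $\hat f(k_1, k_2)$ is entry $k_2 p + k_1$. The function names are illustrative, and a direct DFT is included only to check the result.

```python
import cmath

def dft(f):
    """Direct evaluation of (1-1), used as a reference."""
    n = len(f)
    return [sum(f[j] * cmath.exp(2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def two_stage_dft(f, p, q):
    n = p * q
    assert len(f) == n
    # Stage 1 (2-3): for each k1 and j2, sum over j1; about p*p*q operations.
    inner = [[sum(cmath.exp(2j * cmath.pi * j1 * k1 / p) * f[j1 * q + j2]
                  for j1 in range(p))
              for j2 in range(q)]
             for k1 in range(p)]
    # Stage 2 (2-4): for each k1 and k2, sum over j2; about q*q*p operations.
    fhat = [0j] * n
    for k1 in range(p):
        for k2 in range(q):
            k = k2 * p + k1
            fhat[k] = sum(cmath.exp(2j * cmath.pi * j2 * k / n) * inner[k1][j2]
                          for j2 in range(q))
    return fhat

f = [complex(j, -j * j) for j in range(12)]
assert all(abs(a - b) < 1e-8 for a, b in zip(dft(f), two_stage_dft(f, 3, 4)))
```

Applying the same splitting recursively to the length-$p$ and length-$q$ sums gives the multi-stage algorithm described above.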

A group-theoretic interpretation. Auslander and Tolimieri’s paper [3] related
the Cooley-Tukey algorithm to the Weil-Brezin map for the finite Heisenberg
group. Here we present an alternate group-theoretic interpretation, originally
due to Beth [4], that is more amenable to generalization.

The change of variables on the first line of (2-1) may be interpreted as the
factorization of the group element $j$ as the (group) product of $j_1 q \in q\mathbb{Z}/n\mathbb{Z}$
with the coset representative $j_2$. Thus, if we write $G = \mathbb{Z}/n\mathbb{Z}$, $H = q\mathbb{Z}/n\mathbb{Z}$,
and let $Y$ denote our set of coset representatives, the change of variables can be
rewritten as

$$g = y \cdot h, \quad y \in Y,\ h \in H. \tag{2-5}$$

The second change of variables in (2-1) can be interpreted using the notion of
restriction of representations. It is easy to see that restricting a representation
on a group $G$ to a subgroup $H$ yields a representation of that subgroup. In the
case of $q\mathbb{Z}/n\mathbb{Z}$ this amounts to the observation that

$$e^{2\pi i j_1 q (k_2 p + k_1)/n} = e^{2\pi i j_1 k_1/p},$$

which is used to prove (2-2).
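
For completeness, the exponent in (1-1) splits as follows under the change of variables (2-1), using $n = pq$; this short derivation is a worked restatement of the step, not an additional result.

```latex
\begin{align*}
jk &= (j_1 q + j_2)(k_2 p + k_1) \\
   &= j_1 k_2\, pq + j_1 k_1\, q + j_2 (k_2 p + k_1), \\
e^{2\pi i jk/n}
   &= \underbrace{e^{2\pi i j_1 k_2}}_{=\,1}\;
      e^{2\pi i j_1 k_1 / p}\;
      e^{2\pi i j_2 (k_2 p + k_1)/n}.
\end{align*}
```

The middle factor depends only on $j_1$ and $k_1$, which is what allows the inner sum to be computed once for each pair $(k_1, j_2)$.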

The restriction relations between representations may be represented diagrammatically
using a directed graded graph with three levels. At level zero there is
a single vertex labeled 1, called the root vertex. The vertices at level one are
labeled by the irreducible representations of Z/pZ, and the vertices at level two
are labeled by the irreducible representations of Z/nZ. Edges are drawn from
the root vertex to each of the vertices at level one, and from a vertex at level one
to a vertex at level two if and only if the representation at the tip restricts to the
representation at the tail. The directed graph obtained is the Bratteli diagram
for the chain of subgroups Z/nZ > Z/pZ > 1. Figure 1 shows the situation for
the chain Z/6Z > 2Z/6Z ≅ Z/3Z > 1.

[Figure 1: three columns of vertices, labeled Z/6Z, 2Z/6Z, 1.]

Figure 1. The Bratteli diagram for Z/6Z > 2Z/6Z > 1. The representation $\chi_k$
of Z/mZ is defined by $\chi_k(l) = e^{2\pi i kl/m}$.

In this way the irreducible representations of Z/nZ are indexed by paths
$(k_1, k_2)$ in the Bratteli diagram for Z/nZ > Z/pZ > 1. The DFT factorization
(2-2) now becomes

$$\hat f(k_1, k_2) = \sum_{y \in Y} \chi_{k_1, k_2}(y) \sum_{h \in H} f(y \cdot h)\, \chi_{k_1}(h). \tag{2-6}$$

The two-stage algorithm is now restated as first computing a set of sums that 
depend on only the first leg of the paths, and then combining these to compute 
the final sums that depend on the full paths. 
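
The Bratteli diagram for a cyclic chain is small enough to generate directly. The sketch below labels the character $\chi_k$ by the integer $k$ and uses the restriction rule from the text: $\chi_k$ of Z/nZ restricts on the multiples of $q$ to $\chi_{k \bmod p}$ of Z/pZ. The function names are illustrative.

```python
def bratteli_cyclic(n, p):
    """Edges of the Bratteli diagram for Z/nZ > qZ/nZ (isomorphic to Z/pZ) > 1."""
    assert n % p == 0
    root_edges = [("1", k1) for k1 in range(p)]
    # chi_k(l * q) = e^{2 pi i k l q / n} = e^{2 pi i k l / p}, i.e. chi_{k mod p}.
    restriction_edges = [(k % p, k) for k in range(n)]
    return root_edges, restriction_edges

def paths(n, p):
    """Paths (k1, k2) from the root; each ends at the character k = k2*p + k1."""
    q = n // p
    return [(k1, k2) for k1 in range(p) for k2 in range(q)]

root_edges, restriction_edges = bratteli_cyclic(6, 3)
assert sorted(restriction_edges) == [(0, 0), (0, 3), (1, 1), (1, 4), (2, 2), (2, 5)]
assert sorted(k2 * 3 + k1 for k1, k2 in paths(6, 3)) == list(range(6))
```

Each character of Z/6Z is reached by exactly one path, matching the indexing of frequencies by pairs $(k_1, k_2)$.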

In summary, the group elements have been indexed according to a particular 
factorization scheme, while the irreducible representations (the dual group) are 
now indexed by paths in a Bratteli diagram, describing the restriction of repre- 
sentations. This allows us to compute the Fourier transform in stages, using one 
fewer group element factor at each stage, but using paths of increasing length in 
the Bratteli diagram.

3. Fast Fourier Transforms on Symmetric Groups

A fair amount of attention has been devoted to developing efficient Fourier 
transform algorithms for the symmetric group. One motivation for developing 
these algorithms is the goal of analyzing data on the symmetric group using a 
spectral approach. In the simpler case of time series data on the cyclic group, 
this approach amounts to projecting the data vector onto the basis of complex 
exponentials. 

The spectral approach to data analysis makes sense for a function defined 
on any kind of group, and such a general formulation is due to Diaconis (see 
[8], for example). The case of the symmetric group corresponds to considering 
ranked data. For instance, a group of people might be asked to rank a list of 4 
restaurants in order of preference. Thus, each respondent chooses a permutation 
of the original ordered list of 4 objects, and counting the number of respon-
dents choosing each permutation yields a function on $S_4$. It turns out that the
corresponding Fourier decomposition of this function naturally describes various 
coalition effects that may be useful in describing the data. 

To get some feel for this, notice that the Fourier transform at the matrix
elements $\rho_{ij}(\pi)$ of the (reducible) defining representation counts the number of
people ranking restaurant $i$ in position $j$. If instead $\rho$ is the (reducible) permutation
representation of $S_n$ on unordered pairs $\{i, j\}$, then for each choice of $\{i, j\}$
and $\{k, l\}$ the individual Fourier transforms count the number of respondents
ranking restaurants $i$ and $j$ in positions $k$ and $l$. See [8] for a more thorough
explanation.
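
A small numerical illustration of the first claim, using made-up survey data: each respondent's answer is stored as a tuple `pi` with `pi[j]` the restaurant placed in position `j`, and the defining representation is taken to be $\rho_{ij}(\pi) = 1$ when $\pi(j) = i$. Both the data and this indexing convention are assumptions for the sketch.

```python
from collections import Counter

# Hypothetical survey responses: pi[j] is the restaurant ranked in position j.
rankings = [(0, 1, 2, 3), (1, 0, 2, 3), (0, 1, 3, 2), (0, 1, 2, 3), (2, 0, 1, 3)]
f = Counter(rankings)            # f(pi) = number of respondents choosing pi

def defining_representation(pi):
    """Permutation matrix with (i, j) entry 1 exactly when pi[j] == i."""
    n = len(pi)
    return [[1 if pi[j] == i else 0 for j in range(n)] for i in range(n)]

def fourier_at_defining(f, n=4):
    """Matrix sum over pi of f(pi) * rho(pi)."""
    total = [[0] * n for _ in range(n)]
    for pi, count in f.items():
        rho = defining_representation(pi)
        for i in range(n):
            for j in range(n):
                total[i][j] += count * rho[i][j]
    return total

counts = fourier_at_defining(f)
# Entry (i, j) is the number of respondents who put restaurant i in position j.
assert counts[0][0] == 3 and counts[1][0] == 1 and counts[0][1] == 2
```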

The first FFT for symmetric groups (an $O(|G| \log^3 |G|)$ algorithm) was due
to M. Clausen. In what follows we summarize recent improvements on Clausen’s
result.

Example: Computing the Fourier transform on $S_4$. The fast Fourier
transform for $S_4$ is obtained by mimicking the group-theoretic approach to the
Cooley-Tukey algorithm. More precisely, we shall rewrite the formula for the
Fourier transform using two changes of variables: one using factorizations of
group elements, and the other using paths in a Bratteli diagram. The former
comes from the reduced word decomposition of $g \in S_4$, by which $g$ may be
uniquely expressed as

$$g = s_2^4 \cdot s_3^4 \cdot s_4^4 \cdot s_2^3 \cdot s_3^3 \cdot s_2^2, \tag{3-1}$$

where $s_i^j$ is either $e$ or the transposition $(i\ \ i-1)$, and $s_{i_1}^j = e$ implies that $s_{i_2}^j = e$
for $i_2 < i_1$. Thus any function on the group $S_4$ may be thought of as a function
of the 6 variables $s_2^4, s_3^4, s_4^4, s_2^3, s_3^3, s_2^2$.
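
The uniqueness of the reduced word decomposition can be confirmed by enumeration. The sketch below generates every admissible choice of the six variables (at each level, a run of identities followed by consecutive adjacent transpositions) and checks that the 4 · 3 · 2 = 24 products are distinct elements of $S_4$. Permutations are stored 0-indexed and composed as functions; these conventions and the function names are assumptions of the sketch.

```python
import itertools

def compose(a, b):
    """Composition of permutations stored as tuples: (a*b)(x) = a[b[x]]."""
    return tuple(a[b[x]] for x in range(len(b)))

def transposition(i, n=4):
    """The adjacent transposition (i i-1) on {1,...,n}, stored 0-indexed."""
    t = list(range(n))
    t[i - 1], t[i - 2] = t[i - 2], t[i - 1]
    return tuple(t)

IDENTITY = tuple(range(4))

def level_words(j):
    """Admissible (s_2^j, ..., s_j^j): identities (None) up to some index, then s_i."""
    words = []
    for start in range(2, j + 2):      # start == j + 1 gives the all-identity word
        words.append([i if i >= start else None for i in range(2, j + 1)])
    return words

def evaluate(word):
    g = IDENTITY
    for i in word:
        if i is not None:
            g = compose(g, transposition(i))
    return g

elements = set()
count = 0
for w4, w3, w2 in itertools.product(level_words(4), level_words(3), level_words(2)):
    elements.add(compose(compose(evaluate(w4), evaluate(w3)), evaluate(w2)))
    count += 1
assert count == 24 and len(elements) == 24
```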

To index the matrix elements of $S_4$, paths in a Bratteli diagram are used, this
time relative to the chain of subgroups $S_4 > S_3 > S_2 > S_1 > 1$. The irreducible
representations of $S_n$ are in one-to-one correspondence with partitions of the
integer $n$, with restriction of representations corresponding to deleting a box in
the Young diagram. The corresponding Bratteli diagram is called Young’s lattice,
and is shown in Figure 2. Paths in Young’s lattice from the empty partition $\phi$
to $\beta_4$, a partition of 4, index the basis vectors of the irreducible representation
corresponding to $\beta_4$. Matrix elements, however, are determined by specifying a
pair of basis vectors, so to index the matrix elements, we must use pairs of paths
in Young’s lattice, starting at $\phi$ and ending in the same partition of 4. Since
there are no multiple edges in Young’s lattice, each path may be described by
the sequence of partitions $\phi, \beta_1, \beta_2, \beta_3, \beta_4$ through which it passes.

Figure 2. Young's lattice up to level 4.
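The path indexing is easy to experiment with. The sketch below, with illustrative helper names of our own choosing, represents partitions as weakly decreasing tuples, enumerates the paths in Young's lattice from the empty partition to each partition of 4, and counts the pairs of paths with a common endpoint. That count should equal $|S_4| = 24$, since there is one matrix element per pair.

```python
def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def remove_box(lam):
    """Partitions obtained from lam by deleting one corner box."""
    result = []
    for r, part in enumerate(lam):
        below = lam[r + 1] if r + 1 < len(lam) else 0
        if part > below:  # the last box of row r is a removable corner
            smaller = lam[:r] + (part - 1,) + lam[r + 1:]
            result.append(tuple(p for p in smaller if p > 0))
    return result

def paths_to(lam):
    """All paths from the empty partition up to lam, each a tuple of partitions."""
    if lam == ():
        return [((),)]
    return [path + (lam,) for mu in remove_box(lam) for path in paths_to(mu)]

# Number of paths ending at each partition of 4 = dimension of that representation.
dims = {lam: len(paths_to(lam)) for lam in partitions(4)}
print(dims)
# Pairs of paths with a common endpoint index the matrix elements.
print(sum(d * d for d in dims.values()))  # 24
```

The dictionary printed first lists the dimensions 1, 3, 2, 3, 1 of the five irreducible representations of $S_4$.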

Before we can state a formula for the Fourier transform, analogous to (2-2)
and (2-6), we must choose bases for the irreducible representations of $S_4$ in
order to define our matrix elements. Efficient algorithms are known only for
special choices of bases, and our algorithm uses the representations in Young’s
orthogonal form, which is equivalent to the following equation (3-2) for the
Fourier transform in the new sets of variables.



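\[
\hat f\begin{pmatrix} \beta_4 & \beta_3 & \beta_2 & \beta_1 \\ \gamma_4 & \gamma_3 & \gamma_2 & \gamma_1 \end{pmatrix}
= \sum_{g = s_2^4 s_3^4 s_4^4 s_2^3 s_3^3 s_2^2} \ \sum_{\varphi_2,\, \eta_1}
\Bigl( \prod_{2 \le i \le j \le 4} P^{i}_{s_i^j}(\cdots) \Bigr) f(g)
\tag{3-2}
\]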



The functions $P^i_s$ in equation (3-2) are defined below, and for each $i$, the
variables $\beta_i, \gamma_i, \varphi_i, \eta_i$ are partitions of $i$, satisfying the restriction relations described
by Figure 3. A solid line between partitions means that the right partition is
obtained from the left partition by removing a box.

The relationship between (3-2) and Figure 3 is extremely close: we derived
the diagram from the reduced word decomposition first, and then read the
equation off the diagram. Each 2-cell in Figure 3 corresponds to a factor in the
product of $P$ functions in (3-2), and the labels on the boundary of each cell
give the arguments of $P^i_{s_i^j}$. The sum in (3-2) is over those variables occurring in
the interior of Figure 3. Thus, the variables describing the Fourier transformed
function are exactly those appearing on the boundary of the figure.
Figure 3. Restriction relations for (3-2).



Equation (3-2) can be summarized by saying that we take the product over
2-cells, and sum on interior indices, in Figure 3. This suggests a generalization of
the Cooley-Tukey algorithm that corresponds to building up the diagram one
cell at a time. At each stage multiply by the factor corresponding to a 2-cell,
and form the diagram consisting of those 2-cells that have been considered so far.
Then sum over any indices that are in the interior of the diagram for this stage,
but were not in the interior for previous stages. At the end of this algorithm we
have multiplied by the factors for each 2-cell, and summed over all the interior
indices, and have therefore computed the Fourier transform.
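The saving from summing an index as soon as it becomes interior can be seen in a much smaller setting. The sketch below is only a toy analogue, not the $S_4$ transform: it evaluates $T(x,y) = \sum_{a,b} A(x,a) B(a,b) C(b,y)$ for three arbitrary $n \times n$ arrays (hypothetical stand-ins for the factors attached to cells), once by forming every full product and once by adding one factor at a time.

```python
import random

def naive(A, B, C):
    """Form every product A[x][a]*B[a][b]*C[b][y], summing on a and b only at the end."""
    n, mults = len(A), 0
    T = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            for a in range(n):
                for b in range(n):
                    T[x][y] += A[x][a] * B[a][b] * C[b][y]
                    mults += 2
    return T, mults

def staged(A, B, C):
    """Add one factor at a time, summing each index as soon as it is interior."""
    n, mults = len(A), 0
    # Stage 1: multiply A by B. Index a touches no remaining factor, so sum on a.
    M = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for b in range(n):
            for a in range(n):
                M[x][b] += A[x][a] * B[a][b]
                mults += 1
    # Stage 2: multiply by C. Index b is now interior, so sum on b.
    T = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            for b in range(n):
                T[x][y] += M[x][b] * C[b][y]
                mults += 1
    return T, mults

random.seed(0)
n = 4
A, B, C = ([[random.random() for _ in range(n)] for _ in range(n)] for _ in range(3))
T1, m1 = naive(A, B, C)
T2, m2 = staged(A, B, C)
print(m1, m2)  # 512 128
```

Both routines return the same array; only the number of multiplications differs, and the gap widens as more factors and interior indices are involved.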

The order in which the cells are added matters, of course. The order $s_2^2$, $s_2^3$,
$s_3^3$, $s_2^4$, $s_3^4$, $s_4^4$ is known to be most efficient. Here is the algorithm in detail.

Stage 0: Start with $f(s_2^4 s_3^4 s_4^4 s_2^3 s_3^3 s_2^2)$, for all reduced words.

Stage 1: Multiply by $P^2_{s_2^2}$. Sum on $s_2^2$.

Stage 2: Multiply by $P^2_{s_2^3}$. Sum on $s_2^3$.

Stage 3: Multiply by $P^3_{s_3^3}$. Sum on $\eta_1$, $s_3^3$.

Stage 4: Multiply by $P^2_{s_2^4}$. Sum on $s_2^4$.

Stage 5: Multiply by $P^3_{s_3^4}$. Sum on $s_3^4$.

Stage 6: Multiply by $P^4_{s_4^4}$. Sum on $\varphi_2$, $s_4^4$.

The indices occurring in each stage of the algorithm are shown in Figure 4. 



To count the number of additions and multiplications used by the algorithm,
we must count the number of configurations in Young’s lattice corresponding to
each of the diagrams in Figure 4. This yields a grand total of 130 additions and
130 multiplications for the Fourier transform on $S_4$.

The generalization to higher order symmetric groups is straightforward. The 
reduced word decomposition gives the group element factorization and Young’s 
orthogonal form allows us to change variables, and the formula and algorithm 
for the Fourier transform can be read off a diagram generalizing Figure 3. The 
diagram for $S_5$ is shown, for example, in Figure 5.




Figure 4. Variables occurring at each stage of the fast Fourier transform for $S_4$.

Figure 5. Restriction relations in the Fourier transform formula for $S_5$.



We have computed the exact operation counts for symmetric groups $S_n$ with
n < 50, and a general formula seems hard to come by. (Presumably n < 50 would
cover all cases where the algorithm might ever be implemented, but the same
numbers arise in FFTs on homogeneous spaces, which have far fewer elements.)

However, bounds are easier to obtain: 



Theorem 3.1 ([13]). The number of additions (or multiplications) required by
the above algorithm (as generalized to $S_n > S_{n-1} > \cdots > S_1$) is exactly

\[ n! \sum_{k=2}^{n} \frac{1}{k} \sum_{i=2}^{k} \frac{1}{(i-1)!}\, F_i, \]

where $F_i$ is the number of configurations in Young’s lattice of the form

[configuration diagram] (3-3)

Furthermore, $F_i \le 3\left(1 - \frac{1}{i}\right) i!$, so the number of additions (multiplications) is
bounded by $\frac{3}{4}\, n(n-1) \cdot n!$.
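The step from the bound on $F_i$ to the final bound can be written out. The following short derivation assumes that the exact count has the double-sum form $n! \sum_{k=2}^{n} \frac{1}{k} \sum_{i=2}^{k} \frac{F_i}{(i-1)!}$, which is our reading of Theorem 3.1, and uses nothing beyond $F_i \le 3(1 - \frac{1}{i})\, i!$.

```latex
\begin{align*}
n!\sum_{k=2}^{n}\frac{1}{k}\sum_{i=2}^{k}\frac{F_i}{(i-1)!}
 &\le n!\sum_{k=2}^{n}\frac{1}{k}\sum_{i=2}^{k}\frac{3\,(1-\tfrac{1}{i})\,i!}{(i-1)!}
  = n!\sum_{k=2}^{n}\frac{1}{k}\sum_{i=2}^{k}3\,(i-1) \\
 &= n!\sum_{k=2}^{n}\frac{1}{k}\cdot\frac{3\,k(k-1)}{2}
  = \frac{3}{2}\,n!\sum_{k=2}^{n}(k-1)
  = \frac{3}{4}\,n(n-1)\cdot n!.
\end{align*}
```

The only simplifications used are $(1 - \frac{1}{i})\, i!/(i-1)! = i - 1$ and the two arithmetic sums $\sum_{i=2}^{k}(i-1) = k(k-1)/2$ and $\sum_{k=2}^{n}(k-1) = n(n-1)/2$.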

Why stop at $S_n$? The algorithm for the FFT on $S_n$ generalizes to any wreath
product $S_n[G]$ with the symmetric group. The subgroup chain is replaced by
the chain

\[ S_n[G] > S_{n-1}[G] \times G > S_{n-1}[G] > \cdots > S_2[G] > G \times G > G, \tag{3-4} \]

and the reduced word decomposition is replaced by the factorization

\[ x = s_2^n \cdots s_n^n\, g^n\, s_2^{n-1} \cdots s_{n-1}^{n-1}\, g^{n-1} \cdots s_2^2\, g^2\, g^1. \tag{3-5} \]



Adapting the $S_n$ argument along these lines gives the following new result.

Theorem 3.2. The number of operations needed to compute a Fourier transform
on $S_n[G]$ is at most

( 3 n{n ~ l) \G\d% + n{t G + \\G\(h G d 2 G - |G|))) \S n [G\\ 

where $h_G$ is the number of conjugacy classes in $G$, $d_G$ is the maximal degree of
an irreducible representation of $G$, and $t_G$ is the number of operations required
to compute a Fourier transform on $G$. If $G$ is abelian, then the inner term
$h_G d_G^2 - |G| = 0$.

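The functions $P^i_s$ defining Young’s orthogonal form are defined as follows. For
any two boxes $b_1$ and $b_2$ in a Young diagram, we define the axial distance from
$b_1$ to $b_2$ to be $d(b_1, b_2)$, where $d(b_1, b_2) = \mathrm{row}(b_2) - \mathrm{row}(b_1) + \mathrm{column}(b_1) - \mathrm{column}(b_2)$.
Now suppose $\beta_i, \beta_{i-1}, \alpha_{i-1}, \alpha_{i-2}$ are partitions and that $\beta_{i-1}, \alpha_{i-1}$
are obtained from $\beta_i$ by removing a box, and are obtained from $\alpha_{i-2}$ by adding
a box. Then the skew diagrams $\beta_i - \beta_{i-1}$ and $\beta_{i-1} - \alpha_{i-2}$ each consist of a
single box, and $P^i$ is given by

\[
P^i_e \begin{pmatrix} \beta_i & \beta_{i-1} \\ \alpha_{i-1} & \alpha_{i-2} \end{pmatrix}
= \begin{cases} 1 & \text{if } \alpha_{i-1} = \beta_{i-1}, \\ 0 & \text{if } \alpha_{i-1} \ne \beta_{i-1}, \end{cases}
\]

\[
P^i_{(i\ i-1)} \begin{pmatrix} \beta_i & \beta_{i-1} \\ \alpha_{i-1} & \alpha_{i-2} \end{pmatrix}
= \begin{cases}
d(\beta_i - \beta_{i-1},\, \beta_{i-1} - \alpha_{i-2})^{-1} & \text{if } \alpha_{i-1} = \beta_{i-1}, \\
\sqrt{1 - d(\beta_i - \beta_{i-1},\, \beta_{i-1} - \alpha_{i-2})^{-2}} & \text{if } \alpha_{i-1} \ne \beta_{i-1}.
\end{cases}
\tag{3-6}
\]

For a proof of this formula, in slightly different notation, see [11], Chapter 3.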
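A numerical sanity check of (3-6) is straightforward. The sketch below builds the matrices of the transpositions $(i\ \ i-1)$ for every irreducible representation of $S_4$ in the path basis and tests the defining relations of the symmetric group. It assumes one particular sign convention for the axial distance, namely the difference of contents (column minus row) of the two boxes; sign conventions vary between references, and all function names here are illustrative.

```python
import math

def partitions(n, largest=None):
    """Yield the partitions of n as weakly decreasing tuples."""
    largest = n if largest is None else largest
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def remove_box(lam):
    """Partitions obtained from lam by deleting one corner box."""
    out = []
    for r, part in enumerate(lam):
        if part > (lam[r + 1] if r + 1 < len(lam) else 0):
            out.append(tuple(p for p in lam[:r] + (part - 1,) + lam[r + 1:] if p > 0))
    return out

def paths_to(lam):
    """Paths in Young's lattice from the empty partition to lam."""
    if lam == ():
        return [((),)]
    return [path + (lam,) for mu in remove_box(lam) for path in paths_to(mu)]

def added_box(big, small):
    """(row, column), 1-indexed, of the single box of big that is not in small."""
    for r, part in enumerate(big):
        if r >= len(small) or small[r] != part:
            return (r + 1, part)

def axial_distance(b1, b2):
    """ASSUMED convention: content(b1) - content(b2), with content = column - row."""
    return (b1[1] - b1[0]) - (b2[1] - b2[0])

def transposition_matrix(lam, i):
    """Matrix of (i i-1) on the path basis of the representation labelled by lam."""
    paths = paths_to(lam)
    index = {p: k for k, p in enumerate(paths)}
    M = [[0.0] * len(paths) for _ in paths]
    for p in paths:
        d = axial_distance(added_box(p[i], p[i - 1]), added_box(p[i - 1], p[i - 2]))
        M[index[p]][index[p]] = 1.0 / d
        for mid in remove_box(p[i]):  # the other partition between p[i-2] and p[i], if any
            if mid != p[i - 1] and p[i - 2] in remove_box(mid):
                q = p[:i - 1] + (mid,) + p[i:]
                M[index[p]][index[q]] = math.sqrt(1.0 - 1.0 / d ** 2)
    return M

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def close(A, B, tol=1e-9):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def satisfies_relations(n):
    """Check s^2 = 1, the braid relations, and commutation of distant transpositions."""
    for lam in partitions(n):
        s = {i: transposition_matrix(lam, i) for i in range(2, n + 1)}
        dim = len(s[2])
        one = [[float(r == c) for c in range(dim)] for r in range(dim)]
        ok = all(close(matmul(s[i], s[i]), one) for i in s)
        ok = ok and all(close(matmul(matmul(s[i], s[i + 1]), s[i]),
                              matmul(matmul(s[i + 1], s[i]), s[i + 1])) for i in range(2, n))
        ok = ok and all(close(matmul(s[i], s[j]), matmul(s[j], s[i]))
                        for i in s for j in s if j >= i + 2)
        if not ok:
            return False
    return True

print(satisfies_relations(4))
```

With this convention the one-row partition carries the trivial representation and the one-column partition carries the sign representation; negating the axial distance exchanges the two labellings.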



4. Generalization to Other Groups 

The FFT described for symmetric groups suggests a general approach to com- 
puting Fourier transforms on finite groups. Here is the recipe. 

(i) Choose a chain of subgroups

\[ G = G_m > G_{m-1} > \cdots > G_1 > G_0 = 1 \tag{4-1} \]

for the group. This determines the Bratteli diagram that we will use to index
the matrix elements of $G$. In the general case, this Bratteli diagram may have
multiple edges, so a path is no longer determined by the nodes it visits.

(ii) Choose a factorization $g = g_n \cdot g_{n-1} \cdots g_1$ of each group element $g$. Choose
the $g_i$ so that they lie in as small a subgroup $G_k$ as possible, and commute
with as large a subgroup $G_l$ as possible.

(iii) Choose a system of Gel’fand-Tsetlin bases [9] for the irreducible
representations of $G$ relative to the chain (4-1). These are bases that are indexed
by paths in the Bratteli diagram and that behave well under restriction of
representations. Relative to such a basis, the representation matrices of $g_i$ will
be block diagonal whenever $g_i$ lies in a subgroup from the chain, and block
scalar whenever $g_i$ commutes with all elements of a subgroup from the chain.

(iv) Now write the Fourier transform in coordinates, as a function of the pairs of
paths in the Bratteli diagram with a common endpoint, and with the original
function written as a function of $g_1, \ldots, g_n$. This will be a sum of products
indexed by edges in the Bratteli diagram which lie in some configuration
generalizing (3). This configuration of edges specifies the way in which the
nonzero elements of the representation matrices appear in the formula for the
Fourier transform in coordinates.

(v) The algorithm proceeds by building up the product piece by piece, and 
summing on as many partially indexed variables as possible. 
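In the simplest abelian setting the recipe reduces to the familiar two-stage Cooley-Tukey algorithm. The sketch below is an illustration under our own naming, not code from any of the cited works: it takes the cyclic group $\mathbb{Z}/N$ with $N = pq$, the chain $\mathbb{Z}/N > q\mathbb{Z}/N > 0$, and the factorization $x = x_1 + q x_2$ into a coset representative and a subgroup element, then compares the result with the direct transform.

```python
import cmath

def root(t, m):
    """The character value exp(-2*pi*i*t/m)."""
    return cmath.exp(-2j * cmath.pi * t / m)

def dft(values):
    """Direct Fourier transform on Z/N, using about N^2 multiplications."""
    n = len(values)
    return [sum(values[x] * root(x * k, n) for x in range(n)) for k in range(n)]

def two_stage_fft(values, p, q):
    """Fourier transform on Z/(p*q) built from the subgroup q*Z/N of order p.

    Elements are factored as x = x1 + q*x2 with 0 <= x1 < q and 0 <= x2 < p;
    frequencies are written k = k2 + p*k1 with 0 <= k2 < p and 0 <= k1 < q.
    """
    n = p * q
    assert len(values) == n
    # Stage 1: transform on the subgroup, i.e. sum on x2 for each coset x1.
    inner = [[sum(values[x1 + q * x2] * root(x2 * k2, p) for x2 in range(p))
              for k2 in range(p)] for x1 in range(q)]
    # Stage 2: multiply by the twiddle factors and sum on the coset representatives x1.
    out = [0j] * n
    for k2 in range(p):
        for k1 in range(q):
            k = k2 + p * k1
            out[k] = sum(inner[x1][k2] * root(x1 * k, n) for x1 in range(q))
    return out

signal = [complex(x % 5, (3 * x) % 7) for x in range(12)]
fast = two_stage_fft(signal, p=3, q=4)
slow = dft(signal)
print(max(abs(a - b) for a, b in zip(fast, slow)) < 1e-9)
```

Stage 1 costs about $q \cdot p^2$ multiplications and stage 2 about $N \cdot q$, against $N^2$ for the direct sum; applying the same split recursively to the subgroup gives the usual fast algorithm.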

Further considerations and generalizations. The efficiency of the above 
approach, both in theory, in terms of algorithmic complexity, and practice, in 
terms of execution time, depends on both the choice of factorization and the 
Gel’fand-Tsetlin bases. In particular, very interesting work of L. Auslander, R. 
Johnson and J. Johnson [2] shows how in the abelian case, different factorizations 
correspond to different well-known FFTs, each well suited for execution on a 
different computer architecture. This work shows how to relate the 2-cocycle of 
a group extension to construction of the important “twiddle factor” matrix in 
the factorization of the Fourier matrix. It marks the first appearance of group
cohomology in signal processing and derives an interesting connection between
group theory and the design of retargetable software.

The analogous question for nonabelian groups and other important signal
processing transform algorithms, that is, the problem of finding
architecture-optimized factorizations, is currently being investigated by the SPIRAL project
at Carnegie Mellon [19].

Another abelian idea: the “chirp-z” FFT. The use of subgroups depends 
upon the existence of a nontrivial subgroup. Thus, for a reduction in the case of 
a cyclic group of prime order, a new idea is necessary. In this case, C. Rader’s 
“chirp-z transform” (the “chirp” here refers to radar chirp — the generation of 
an extremely short electromagnetic pulse, i.e., something approaching the ideal 
delta function) may be used [16]. 




